Quick Start: Python
Installation

    pip install edgeparse

Requirements: Python 3.9+ · No additional system dependencies.
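To confirm that an environment meets the stated Python 3.9+ requirement, and that the package can be found after installation, a small standard-library check may help. This is a minimal sketch: the helper names `meets_python_requirement` and `is_installed` are hypothetical and are not part of edgeparse.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the requirements


def meets_python_requirement(version_info=None):
    """Return True if the given (or running) interpreter is at least MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def is_installed(module_name="edgeparse"):
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python OK:", meets_python_requirement())
    print("edgeparse installed:", is_installed())
```

Using `importlib.util.find_spec` avoids actually importing the package, so the check stays cheap and has no side effects.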
Parse a PDF

    import edgeparse

    # Get Markdown string
    markdown = edgeparse.convert("document.pdf", format="markdown")
    print(markdown)

    # Get structured JSON string
    json_str = edgeparse.convert("document.pdf", format="json")

    # Get HTML string
    html = edgeparse.convert("document.pdf", format="html")

Write to File

    import edgeparse

    # Save to the output directory (returns the path of the saved file)
    out_path = edgeparse.convert_file("document.pdf", "output/", format="json")
    print(f"Saved to: {out_path}")

Output Formats
| Format | Description |
|---|---|
| `"markdown"` | Clean Markdown with table support |
| `"json"` | Structured JSON with full metadata |
| `"html"` | Semantic HTML |
| `"text"` | Plain text (reading order) |
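
To compare the formats in the table side by side, one option is to call the conversion function once per format and write each result to its own file. The sketch below takes the converter as a parameter so it can be tried without a real PDF. The helper `convert_all` and the file extensions in `EXTENSIONS` are assumptions made for illustration and are not part of edgeparse; the code relies only on a callable of the shape `convert(path, format=...)` that returns a string.

```python
from pathlib import Path

# Format names come from the table; the extensions are an assumed mapping.
EXTENSIONS = {"markdown": ".md", "json": ".json", "html": ".html", "text": ".txt"}


def convert_all(pdf_path, out_dir, convert):
    """Convert one PDF to every listed format and return {format: written path}.

    `convert` is any callable shaped like convert(path, format=...) -> str,
    for example edgeparse.convert.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem  # "document.pdf" -> "document"
    written = {}
    for fmt, ext in EXTENSIONS.items():
        content = convert(str(pdf_path), format=fmt)
        target = out_dir / f"{stem}{ext}"
        target.write_text(content, encoding="utf-8")
        written[fmt] = target
    return written
```

With the package installed, this could be called as `convert_all("document.pdf", "output/", edgeparse.convert)`. Passing the converter in, rather than importing it inside the helper, keeps the function easy to test with a stub.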
Next Steps
- JSON Schema Reference — understand every field
- Benchmark Results — accuracy & speed data
- Try Live Demo — parse PDFs in your browser with WebAssembly
- Enterprise — self-hosted deployment, priority support by Elitizon