
Running Benchmarks

  • Python 3.10+
  • uv (recommended) or pip
  • EdgeParse CLI installed
cd benchmark
uv sync # install dependencies
uv run python run.py --tool edgeparse

To benchmark a different tool, pass its name to --tool. Available tools: edgeparse, docling, marker, edgequake, opendataloader, pymupdf4llm, markitdown.

uv run python compare_all.py

This compares all tools against the ground-truth reference set and generates an HTML report in reports/.

Place your PDF files in benchmark/pdfs/ and matching ground-truth Markdown in benchmark/ground-truth/markdown/.
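A minimal sketch of the expected layout, assuming each ground-truth file shares its PDF's filename stem (the sample filename is hypothetical):

```shell
# Create the corpus directories and a matching PDF / ground-truth pair.
# "sample" is a placeholder name; real files keep the same stem in both dirs.
mkdir -p benchmark/pdfs benchmark/ground-truth/markdown
touch benchmark/pdfs/sample.pdf
touch benchmark/ground-truth/markdown/sample.md
ls benchmark/pdfs benchmark/ground-truth/markdown
```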

# Run with custom corpus
uv run python run.py --tool edgeparse --pdf-dir ./my-pdfs

Reports are generated as HTML files:

open reports/benchmark-latest.html

The thresholds.json file defines the minimum acceptable score for each metric:

{
  "nid": 0.85,
  "teds": 0.70,
  "mhs": 0.75,
  "overall": 0.80
}

CI will fail if EdgeParse scores drop below these thresholds.
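The gate logic can be sketched as a small Python check. This is an illustrative sketch, not the actual CI script: the hard-coded scores dict stands in for a real benchmark result, and only the threshold keys come from thresholds.json above.

```python
import sys

# Thresholds mirror thresholds.json; scores here are placeholder values
# standing in for a real benchmark run.
thresholds = {"nid": 0.85, "teds": 0.70, "mhs": 0.75, "overall": 0.80}
scores = {"nid": 0.91, "teds": 0.74, "mhs": 0.78, "overall": 0.83}

# Collect every metric whose score falls below its threshold.
failures = {k: (scores[k], need) for k, need in thresholds.items()
            if scores[k] < need}

if failures:
    for metric, (got, need) in failures.items():
        print(f"FAIL {metric}: {got:.2f} < {need:.2f}")
    sys.exit(1)  # nonzero exit fails the CI job
print("All thresholds met")
```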