
Running Benchmarks

  • Python 3.10+
  • uv (recommended) or pip
  • EdgeParse CLI installed
cd benchmark
uv sync # install dependencies
uv run python run.py --tool edgeparse

To benchmark a different tool, pass its name to --tool. Available tools: edgeparse, docling, marker, edgequake, opendataloader, pymupdf4llm, markitdown.

uv run python compare_all.py

This compares all tools against the ground-truth reference set and generates an HTML report in reports/.

Place your PDF files in benchmark/pdfs/ and matching ground-truth Markdown in benchmark/ground-truth/markdown/.
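A minimal sketch of the expected layout, assuming each ground-truth file shares its PDF's filename stem (the sample filename is hypothetical):

```shell
# Create the corpus directories and a matching PDF / ground-truth pair.
# "sample" is a placeholder name; real files keep the same stem in both dirs.
mkdir -p benchmark/pdfs benchmark/ground-truth/markdown
touch benchmark/pdfs/sample.pdf
touch benchmark/ground-truth/markdown/sample.md
ls benchmark/pdfs benchmark/ground-truth/markdown
```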

# Run with custom corpus
uv run python run.py --tool edgeparse --pdf-dir ./my-pdfs

Reports are generated as HTML files:

open reports/benchmark-latest.html

The thresholds.json file defines the minimum acceptable score for each metric:

{
  "nid": 0.85,
  "teds": 0.70,
  "mhs": 0.75,
  "overall": 0.80
}

CI will fail if EdgeParse scores drop below these thresholds.
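The gate logic can be sketched as a small Python check. This is an illustrative sketch, not the actual CI script: the hard-coded scores dict stands in for a real benchmark result, and only the threshold keys come from thresholds.json above.

```python
import sys

# Thresholds mirror thresholds.json; scores here are placeholder values
# standing in for a real benchmark run.
thresholds = {"nid": 0.85, "teds": 0.70, "mhs": 0.75, "overall": 0.80}
scores = {"nid": 0.91, "teds": 0.74, "mhs": 0.78, "overall": 0.83}

# Collect every metric whose score falls below its threshold.
failures = {k: (scores[k], need) for k, need in thresholds.items()
            if scores[k] < need}

if failures:
    for metric, (got, need) in failures.items():
        print(f"FAIL {metric}: {got:.2f} < {need:.2f}")
    sys.exit(1)  # nonzero exit fails the CI job
print("All thresholds met")
```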