Metrics Explained

EdgeParse benchmarks use three complementary metrics that each measure a different aspect of extraction quality.

NID

Measures: Reading order and text completeness.

NID compares the extracted plain text against a ground-truth reference using normalized compression distance, reported as a similarity: a score of 1.0 means the extracted text perfectly matches the reference.
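EdgeParse's exact implementation isn't shown here, but the classic normalized compression distance can be sketched with the standard library's zlib; `ncd_similarity` is a hypothetical helper name, not an EdgeParse API.

```python
import zlib


def _csize(data: bytes) -> int:
    # Length of the zlib-compressed representation at max compression.
    return len(zlib.compress(data, 9))


def ncd_similarity(extracted: str, reference: str) -> float:
    """Return 1 - NCD(x, y); higher means closer to the reference."""
    x, y = extracted.encode(), reference.encode()
    cx, cy, cxy = _csize(x), _csize(y), _csize(x + y)
    # NCD: how much extra information the concatenation needs beyond
    # the more compressible of the two inputs, normalized to ~[0, 1].
    ncd = (cxy - min(cx, cy)) / max(cx, cy)
    return 1.0 - ncd
```

Note that identical texts score near, but not exactly, 1.0 because of compressor overhead, which is one reason real benchmarks bucket scores into bands rather than expecting exact values.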

Score        Meaning
0.95+        Excellent — near-perfect text extraction
0.90–0.95    Good — minor differences
0.80–0.90    Fair — some text missing or reordered
< 0.80       Poor — significant extraction errors
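The bands in the NID table can be expressed as a small threshold lookup; `band_for` is a hypothetical helper for illustration, not part of EdgeParse.

```python
def band_for(score: float) -> str:
    # Thresholds taken from the NID score table above.
    if score >= 0.95:
        return "Excellent"
    if score >= 0.90:
        return "Good"
    if score >= 0.80:
        return "Fair"
    return "Poor"
```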

TEDS

Measures: Table structure accuracy.

TEDS computes the tree edit distance between extracted HTML tables and ground-truth tables, normalized to a 0–1 similarity score. It penalizes missing rows, merged cells, and misaligned columns.
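Real TEDS runs a tree edit distance over the table's HTML DOM; that algorithm is too involved to reproduce here, but the effect of missing or reordered rows can be illustrated with a much-simplified flat proxy built on difflib. This is a sketch only, not the actual TEDS computation, and it ignores row/column spans entirely.

```python
from difflib import SequenceMatcher

# A table as a list of rows, each row a tuple of cell texts.
Table = list[tuple[str, ...]]


def table_similarity(pred: Table, truth: Table) -> float:
    """Crude 0-1 proxy: sequence-match the rows of two tables.

    Real TEDS compares full DOM trees (so cell merges and column
    misalignment are penalized too); this flat version only shows
    how missing or reordered rows pull the score below 1.0.
    """
    return SequenceMatcher(None, pred, truth).ratio()
```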

Score        Meaning
0.90+        Excellent — tables nearly identical
0.80–0.90    Good — minor structural differences
0.60–0.80    Fair — some rows/columns misaligned
< 0.60       Poor — significant structural errors

MHS

Measures: Document structure / heading detection accuracy.

MHS compares the heading hierarchy (H1–H6) in the extracted output against the ground truth. It rewards correct heading levels and penalizes missing or incorrectly leveled headings.
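EdgeParse's exact MHS formula isn't given here, but the idea can be sketched as: of the ground-truth headings, what fraction were extracted at the correct level? `heading_similarity` is a hypothetical name, and this toy version matches headings by text, ignoring ordering and spurious extra headings, both of which a real scorer would also penalize.

```python
def heading_similarity(pred, truth):
    """Fraction of ground-truth headings extracted at the correct level.

    `pred` and `truth` are lists of (level, text) pairs, e.g. (2, "Methods").
    A heading counts only when both its text and its level (H1-H6) agree,
    so a heading detected as H3 instead of H2 scores zero for that entry.
    """
    if not truth:
        return 1.0
    pred_levels = {text: level for level, text in pred}
    correct = sum(1 for level, text in truth if pred_levels.get(text) == level)
    return correct / len(truth)
```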

Score        Meaning
0.90+        Excellent — heading hierarchy correct
0.80–0.90    Good — minor level mismatches
0.60–0.80    Fair — some headings missing
< 0.60       Poor — heading detection unreliable

Overall Score

The overall score is the arithmetic mean of NID, TEDS, and MHS:

Overall = (NID + TEDS + MHS) / 3
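The formula above is a plain unweighted mean; `overall` is a hypothetical helper name.

```python
def overall(nid: float, teds: float, mhs: float) -> float:
    # Unweighted arithmetic mean: each metric covers an orthogonal
    # quality dimension, so no single one dominates the result.
    return (nid + teds + mhs) / 3
```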

Each metric is weighted equally because they measure orthogonal quality dimensions:

  • NID → “Did we extract the right text?”
  • TEDS → “Did we get the tables right?”
  • MHS → “Did we detect the document structure?”