Rust API
Crate: edgeparse-core
convert(path, config) -> Result<Document>
Parse a PDF file and return a structured Document.
use std::path::Path;
use edgeparse_core::api::config::ProcessingConfig;
use edgeparse_core::convert;

let config = ProcessingConfig::default();
let doc = convert(Path::new("document.pdf"), &config)?;

println!("Pages: {}", doc.number_of_pages);
println!("Elements: {}", doc.kids.len());

Document struct
| Field | Type | Description |
|---|---|---|
| file_name | String | Source file name |
| number_of_pages | usize | Total pages |
| author | Option<String> | PDF author metadata |
| title | Option<String> | PDF title metadata |
| kids | Vec<ContentElement> | Extracted elements in reading order |
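
Because author and title are Option<String>, callers need a fallback for PDFs that carry no metadata. The following is a minimal sketch of one way to handle that. It uses a local stand-in struct that mirrors the fields in the table, not the crate's actual definition; the unit-struct ContentElement and the summarize helper are hypothetical names introduced only for illustration.

```rust
// Hypothetical placeholder: the real ContentElement type from edgeparse-core
// is not described in this section, so a unit struct stands in for it.
#[derive(Debug, Clone)]
pub struct ContentElement;

// Local mirror of the Document fields listed in the table.
// Assumption: an illustration only, not the crate's own struct.
#[derive(Debug, Clone)]
pub struct Document {
    pub file_name: String,
    pub number_of_pages: usize,
    pub author: Option<String>,
    pub title: Option<String>,
    pub kids: Vec<ContentElement>,
}

/// Build a one-line summary, falling back when optional metadata is absent.
pub fn summarize(doc: &Document) -> String {
    // Prefer the PDF title; fall back to the source file name if none is set.
    let title = doc.title.as_deref().unwrap_or(doc.file_name.as_str());
    // The author has no natural fallback field, so use a fixed label.
    let author = doc.author.as_deref().unwrap_or("unknown author");
    format!(
        "{} by {} ({} pages, {} elements)",
        title,
        author,
        doc.number_of_pages,
        doc.kids.len()
    )
}
```

With the real crate, the same as_deref().unwrap_or(...) pattern should apply to the doc value returned by convert, provided the fields are public as the example above suggests.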
Output Renderers
use edgeparse_core::output;

// Markdown
let md = output::markdown::to_markdown(&doc)?;

// HTML
let html = output::html::to_html(&doc)?;

// JSON (legacy-compatible)
let json = output::legacy_json::to_legacy_json_string(&doc, "document")?;

Crate: pdf-cos
Low-level PDF COS object parser. Used internally by edgeparse-core.
[dependencies]
edgeparse-core = "0.1"