Batch Processing
CLI Batch Processing
    # Process all PDFs in a directory
    edgeparse ./documents/*.pdf -f markdown --output-dir ./results/

    # Process with specific format
    edgeparse ./invoices/*.pdf -f json --output-dir ./output/

Python Batch Processing
    import edgeparse
    from pathlib import Path

    input_dir = Path("documents")
    output_dir = Path("results")
    output_dir.mkdir(exist_ok=True)

    for pdf_path in input_dir.glob("*.pdf"):
        edgeparse.convert_file(str(pdf_path), str(output_dir), format="markdown")
        print(f"Processed: {pdf_path.name}")

Node.js Batch Processing
    import { convert } from "edgeparse";
    import { mkdirSync, readdirSync, writeFileSync } from "fs";
    import { join, basename } from "path";

    const inputDir = "documents";
    const outputDir = "results";

    // Create the output directory up front; writeFileSync fails if it is missing.
    mkdirSync(outputDir, { recursive: true });

    const files = readdirSync(inputDir).filter(f => f.endsWith(".pdf"));

    for (const file of files) {
      const result = convert(join(inputDir, file), { format: "markdown" });
      const outName = basename(file, ".pdf") + ".md";
      writeFileSync(join(outputDir, outName), result);
      console.log(`Processed: ${file}`);
    }

Performance Tips
- EdgeParse processes each page in parallel using Rayon
- For large batches, the CLI handles parallelism automatically
- Expect ~40 pages/second on modern hardware (single-threaded)
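
When a large batch is driven from Python rather than the CLI, one possible approach is to fan the files out over a small worker pool and collect failures instead of stopping at the first bad PDF. The sketch below is only an illustration: `convert_batch` and its parameters are hypothetical names, not part of EdgeParse, and whether threads give any real speedup depends on how the Python binding handles the GIL, which is not stated here. The conversion call itself is passed in as a function, so the helper makes no assumptions about the EdgeParse API.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def convert_batch(
    pdf_paths: Iterable[Path],
    convert_one: Callable[[Path], None],
    max_workers: int = 4,
) -> tuple[list[Path], dict[Path, Exception]]:
    """Run convert_one on every path; collect failures instead of aborting.

    Hypothetical helper: returns (sorted successful paths, {failed path: error}).
    """
    succeeded: list[Path] = []
    failed: dict[Path, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its input path so errors can be attributed.
        futures = {pool.submit(convert_one, path): path for path in pdf_paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()
                succeeded.append(path)
            except Exception as exc:  # keep going; report the failure later
                failed[path] = exc
    return sorted(succeeded), failed


# Possible wiring, reusing the convert_file call from the Python section above
# (left as a comment so the sketch runs without edgeparse installed):
#
#   import edgeparse
#   out = Path("results")
#   out.mkdir(exist_ok=True)
#   ok, failed = convert_batch(
#       Path("documents").glob("*.pdf"),
#       lambda p: edgeparse.convert_file(str(p), str(out), format="markdown"),
#   )
#   for path, err in failed.items():
#       print(f"Failed: {path.name}: {err}")
```

Since EdgeParse already parallelizes pages internally, a small `max_workers` value is probably a sensible starting point; measuring on a representative batch is the only reliable way to tell whether the extra workers help.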