Roadmap
Upcoming features and development priorities
Coming Soon
Q3 2026
| Feature | Description | Status |
|---|---|---|
| Structure Validation | Verify and repair PDF tag trees | Planned |
| TOC Extraction | Auto-detect document navigation structure | Planned |
Recently Shipped
| Feature | Description | Version | Date |
|---|---|---|---|
| Auto-Tagging Engine | Generate accessible Tagged PDFs from untagged PDFs (`--format tagged-pdf`) | Latest | 2026-Q2 |
| Apache 2.0 License | License migration from MPL-2.0 to Apache-2.0 | v2.0.0 | 2026-03-11 |
| Header/Footer Control | `--include-header-footer` option for output generation | v1.10.0 | 2026-02-04 |
| Equation & Figure AI | LaTeX formula extraction and AI chart/image description via hybrid mode | v1.8.0 | 2026-01-13 |
| Hybrid Mode Options | `--hybrid-mode full` for formula and picture enrichments, plus the `--hybrid-ocr` option | v1.8.0 | 2026-01-13 |
| OCR for Scanned PDFs | Extract text from image-based PDFs via hybrid mode | v1.6.0 | 2026-01-05 |
| Table AI | ML-assisted detection for borderless and merged-cell tables via hybrid mode | v1.6.0 | 2026-01-05 |
| XY-Cut++ Reading Order | Improved multi-column layout detection | v1.4.0 | 2025-12-19 |
| Base64 Image Embedding | Embed images directly in JSON/HTML/Markdown output | v1.4.0 | 2025-12-19 |
| Tagged PDF Support | Native structure tag extraction | v1.3.0 | 2025-11-21 |
| Benchmarks & Datasets | Transparent evaluations using open datasets and standardized metrics | v1.3.0 | 2025-11-21 |
| AI Safety Filters | Auto-filter hidden text and prompt injection content | v1.0.0 | 2025-09-16 |
Feature Requests
Have a feature request? Open an issue on GitHub.