Manukai AG

Bridging 2D and 3D: Generative AI for Manufacturing Intelligence

Inspiration

Modern manufacturing still depends heavily on 2D engineering drawings, even when 3D CAD models are available. Critical product and manufacturing information (PMI), such as hole types, thread specifications, and tolerances, often exists only as visual annotations in PDFs.

This creates a major bottleneck: engineers must manually interpret drawings and map them to 3D geometry. This process is slow, error-prone, and does not scale.

We were inspired to bridge this gap and move towards a true digital manufacturing pipeline, where design intent becomes directly machine-readable.

What it does

This project automatically:

  • Extracts hole and thread annotations from PDF drawings
  • Understands manufacturing intent (dimensions, tolerances, threads)
  • Maps them to corresponding 3D features in STEP models
  • Outputs structured, machine-readable JSON with confidence and traceability
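
As a rough illustration of what "structured JSON with confidence and traceability" could look like, here is a minimal sketch. The class names, field names, and sample values are all assumptions made for this example, not the project's actual schema, and it uses stdlib dataclasses as a dependency-free stand-in (the project itself lists pydantic).

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Optional


@dataclass
class HoleAnnotation:
    """One callout read from the drawing (hypothetical field names)."""
    raw_text: str                     # annotation text as found in the PDF
    diameter_mm: float
    count: int = 1
    thread: Optional[str] = None      # e.g. "M6x1.0" for a threaded hole
    tolerance: Optional[str] = None   # e.g. "H7"
    page: int = 1                     # traceability: page the text came from


@dataclass
class HoleMatch:
    """An annotation linked to zero or more STEP features (hypothetical)."""
    annotation: HoleAnnotation
    step_feature_ids: list = field(default_factory=list)
    confidence: float = 0.0
    evidence: str = ""                # short human-readable justification


def to_json(matches):
    # asdict() recurses into nested dataclasses, so the result is plain JSON.
    return json.dumps([asdict(m) for m in matches], indent=2, ensure_ascii=False)


# Illustrative values only; they do not come from a real drawing.
example = HoleMatch(
    annotation=HoleAnnotation(raw_text="4x ∅6.6 THRU", diameter_mm=6.6, count=4),
    step_feature_ids=[112, 118, 124, 130],
    confidence=0.95,
    evidence="diameter and count both match a single STEP hole group",
)
print(to_json([example]))
```

Keeping the raw annotation text and page next to the matched feature IDs is what lets a reviewer trace each JSON entry back to the drawing.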

How we built it

We designed a four-layer hybrid pipeline that combines the strengths of traditional document intelligence with vision-language models and geometric reasoning:

  1. Azure Document Intelligence for OCR with precise bounding boxes
  2. LLM multi-pass vision extraction with OCR regex pre-scanning, verification sweeps, cropped-region analysis, and OCR-vs-LLM reconciliation
  3. Pure-Python STEP parser that resolves ISO 10303-21 entity chains to extract cylindrical surfaces and group them into logical holes
  4. Hybrid deterministic + LLM correlation that uses diameter matching, count filtering, and counterbore pairing for unambiguous cases, with targeted LLM disambiguation for edge cases
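
To make step 3 more concrete, the sketch below shows one way to resolve an entity chain (CYLINDRICAL_SURFACE → AXIS2_PLACEMENT_3D → CARTESIAN_POINT) with nothing but the standard library. It is a simplified assumption of how such a parser might work, not the project's code: the function names and the sample data are invented, complex (multi-type) entities and strings containing ");" are not handled, and grouping is by diameter only, so it does not distinguish holes from outer cylindrical faces.

```python
import re
from collections import defaultdict

# Matches simple entities such as: #12=CYLINDRICAL_SURFACE('',#11,3.3);
ENTITY_RE = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*?)\)\s*;", re.DOTALL)


def split_args(body):
    """Split an entity's argument list on top-level commas only."""
    args, depth, current = [], 0, []
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


def parse_entities(step_text):
    """Map entity id -> (type name, argument strings)."""
    return {int(i): (name, split_args(body))
            for i, name, body in ENTITY_RE.findall(step_text)}


def find_cylinders(entities):
    """Follow each cylinder's references down to its origin point."""
    cylinders = []
    for eid, (name, args) in entities.items():
        if name != "CYLINDRICAL_SURFACE":
            continue
        placement_id = int(args[1].lstrip("#"))
        radius = float(args[2])
        _, placement_args = entities[placement_id]   # AXIS2_PLACEMENT_3D
        point_id = int(placement_args[1].lstrip("#"))
        _, point_args = entities[point_id]           # CARTESIAN_POINT
        origin = tuple(float(v) for v in point_args[1].strip("()").split(","))
        cylinders.append({"id": eid, "diameter": 2 * radius, "origin": origin})
    return cylinders


def group_by_diameter(cylinders, ndigits=3):
    """Group cylinder ids by rounded diameter (a crude stand-in for hole grouping)."""
    groups = defaultdict(list)
    for c in cylinders:
        groups[round(c["diameter"], ndigits)].append(c["id"])
    return dict(groups)


# Hand-written sample data, not taken from a real STEP file.
SAMPLE = """
#8=CARTESIAN_POINT('',(10.,20.,0.));
#9=DIRECTION('',(0.,0.,1.));
#10=DIRECTION('',(1.,0.,0.));
#11=AXIS2_PLACEMENT_3D('',#8,#9,#10);
#12=CYLINDRICAL_SURFACE('',#11,3.3);
#18=CARTESIAN_POINT('',(40.,20.,0.));
#21=AXIS2_PLACEMENT_3D('',#18,#9,#10);
#22=CYLINDRICAL_SURFACE('',#21,3.3);
"""

cylinders = find_cylinders(parse_entities(SAMPLE))
print(group_by_diameter(cylinders))
```

A production parser would also need to compare axes and positions, since CAD exporters commonly split one hole into several cylindrical faces.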

Every layer is designed to fail gracefully: if Azure Document Intelligence (Azure DI) is unavailable, the pipeline continues in vision-only mode, and ambiguous matches are flagged with lower confidence rather than forced.
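
The "flag rather than force" idea can be sketched as follows. This is an assumed, simplified version of the deterministic part of the correlation step: it covers diameter matching and count filtering only (no counterbore pairing), and the class names, the 0.05 mm tolerance, and the confidence values are placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    label: str
    diameter_mm: float
    count: int


@dataclass
class HoleGroup:
    group_id: str
    diameter_mm: float
    count: int


def correlate(annotations, groups, tol_mm=0.05):
    """Match annotations to STEP hole groups; never force an ambiguous match."""
    results = []
    for ann in annotations:
        by_diameter = [g for g in groups
                       if abs(g.diameter_mm - ann.diameter_mm) <= tol_mm]
        by_count = [g for g in by_diameter if g.count == ann.count]
        candidates = by_count or by_diameter

        if len(candidates) == 1 and by_count:
            status, confidence = "matched", 0.95      # diameter and count agree
        elif len(candidates) == 1:
            status, confidence = "matched", 0.70      # diameter fits, count differs
        elif candidates:
            status, confidence = "ambiguous", 0.40    # candidate for LLM disambiguation
        else:
            status, confidence = "unmatched", 0.0

        results.append({
            "annotation": ann.label,
            "candidates": [g.group_id for g in candidates],
            "status": status,
            "confidence": confidence,
        })
    return results


# Illustrative inputs only.
annotations = [
    Annotation("4x ∅6.6 THRU", 6.6, 4),
    Annotation("2x ∅10 H7", 10.0, 2),
    Annotation("∅3.0", 3.0, 1),
]
groups = [
    HoleGroup("G1", 6.6, 4),
    HoleGroup("G2", 10.0, 2),
    HoleGroup("G3", 10.0, 2),
]

for row in correlate(annotations, groups):
    print(row)
```

In this example the ∅10 annotation has two equally plausible groups, so it is returned as ambiguous with both candidates listed instead of being assigned to either one.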

Challenges we ran into

  • GD&T is not natural language. Engineering drawing annotations use specialised symbols (∅, ↧, feature control frames) that OCR engines frequently misread. We solved this with a regex pre-scan that creates a "ground-truth checklist" injected into every LLM prompt.
  • Same diameter, different holes. Real drawings often have multiple groups of identically-sized holes in different locations. Distinguishing them requires cross-referencing datum references, position tolerances, and spatial context.
  • Deduplication across multi-pass extraction. Our multi-pass approach (per-page + verification + cropped-region + reconciliation) greatly improves recall but introduces duplicates. We implemented three-stage deduplication (within-page, cross-page, diameter-based) to balance recall and precision.
  • STEP files without a CAD kernel. We built a pure-Python STEP parser to avoid heavy compiled dependencies, which required resolving complex entity reference chains.
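
The regex pre-scan from the first challenge might look roughly like this. The patterns, function names, and prompt wording are assumptions for illustration; they cover only simple diameter and metric-thread callouts, and the sample OCR text is made up.

```python
import re

# Optional "4x " count prefix, a diameter symbol, then a number (dot or comma decimal).
DIAMETER_RE = re.compile(r"(?:(\d+)\s*[xX×]\s*)?[∅Ø⌀]\s*(\d+(?:[.,]\d+)?)")
# Optional count prefix, then a metric thread such as M6 or M6x1.0.
THREAD_RE = re.compile(
    r"(?:(\d+)\s*[xX×]\s*)?\bM(\d+(?:\.\d+)?)(?:\s*[xX×]\s*(\d+(?:\.\d+)?))?"
)


def prescan(ocr_text):
    """Collect diameter and thread callouts that the LLM must account for."""
    items = []
    for m in DIAMETER_RE.finditer(ocr_text):
        items.append({
            "kind": "diameter",
            "count": int(m.group(1)) if m.group(1) else 1,
            "value": float(m.group(2).replace(",", ".")),
            "raw": m.group(0),
        })
    for m in THREAD_RE.finditer(ocr_text):
        size, pitch = m.group(2), m.group(3)
        items.append({
            "kind": "thread",
            "count": int(m.group(1)) if m.group(1) else 1,
            "value": f"M{size}" + (f"x{pitch}" if pitch else ""),
            "raw": m.group(0),
        })
    return items


def checklist_prompt(items):
    """Render the pre-scan as a checklist to prepend to an LLM prompt."""
    lines = ["The OCR pre-scan found these callouts; account for each one:"]
    lines += [f"- [{item['kind']}] {item['raw']}" for item in items]
    return "\n".join(lines)


# Made-up OCR output for demonstration.
SAMPLE_OCR = "4x ∅6.6 THRU\n2X M6x1.0 - 6H ↧12\nØ10 H7"

found = prescan(SAMPLE_OCR)
print(checklist_prompt(found))
```

Because the checklist is built from literal OCR matches, the model can be asked to either confirm each item or explain why it was dropped, which makes missed callouts easier to spot.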

Accomplishments that we're proud of

  • Built a full end-to-end pipeline (PDF → STEP → JSON)
  • Achieved high recall using a multi-pass extraction strategy
  • Designed a lightweight STEP parser without CAD libraries
  • Enabled traceable, explainable AI outputs
  • Balanced deterministic logic with LLM reasoning effectively

What we learned

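  • For this kind of industrial problem, a hybrid of deterministic logic and LLMs proved more reliable for us than relying on an LLM alone
  • Multi-pass extraction with verification sweeps improves recall, at the cost of duplicates that then have to be removed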
  • Geometry + language understanding is a powerful combination
  • Explainability (confidence + evidence) is critical for trust

What's next for BreakThrough

  • Custom Azure DI model trained on engineering drawings for direct GD&T symbol recognition
  • Two-stage extraction (detect all diameters first, then interpret each one)
  • Spatial clustering using 3D positions from STEP to disambiguate same-diameter hole groups
  • Confidence-based filtering using STEP diameters as ground truth to prune false-positive annotations
  • View-aware extraction that processes each drawing view (SECTION A-A, DETAIL B) as an independent context
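
As one possible starting point for the spatial clustering item, the sketch below groups hole centres with a simple single-linkage rule. It is an assumption about how this planned feature could be approached, not something the project implements today; the function name, the max_gap threshold, and the coordinates are hypothetical.

```python
import math


def cluster_positions(points, max_gap):
    """Single-linkage clustering: any two points within max_gap share a cluster."""
    clusters = []
    for idx, p in enumerate(points):
        # Existing clusters that contain at least one point close enough to p.
        touching = [c for c in clusters
                    if any(math.dist(p, points[j]) <= max_gap for j in c)]
        merged = [idx]
        for c in touching:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)


# Hypothetical centres (in mm) of five holes that all share one diameter.
centres = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (200, 0, 0), (210, 0, 0)]
print(cluster_positions(centres, max_gap=15))
```

Here the five same-diameter holes separate into a group of three and a group of two, which could then be compared against the counts in the drawing's callouts.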

Detailed Architecture

ReadMe

Built With

  • azuredocumentintelligence
  • llm
  • openai
  • pydantic
  • pymupdf
  • python
  • streamlit