Hermes Systems


Inspiration

Every engineer stepping into hardware hits the same wall: you know exactly what you want your FPGA to do, and then you spend days or weeks wrestling with architecture decisions, timing constraints, and implementation tradeoffs — not because the idea was bad, but because the design stage still demands slow, manual iteration that existing tools barely touch.

We built Hermes Systems to collapse that gap. You define intent and constraints for the FPGA; Hermes explores the design space and delivers optimized implementations — keeping you in the loop where your judgment matters, and automating the grind where it doesn't.

What it does

Hermes Systems takes a plain-English description of an FPGA function and returns a structured hardware specification within moments, before any HDL is written.

Domain	Input brief	Output
Education	"Blink an LED when a button is pressed"	GPIO, debounce filter, 1-bit register — ~25 LUTs
HFT	"Sub-100ns order execution pipeline with market data ingestion"	PCIe DMA core, FIFO arbitration, clock domain crossing — latency estimate + resource map
Aerospace	"Redundant flight control signal processor with watchdog"	Triple modular redundancy logic, CRC checker, GPIO isolation — safety flag analysis

For any brief, Hermes Systems returns:

🧩 Functional blocks required
📊 Estimated resource usage across LUTs, FFs, DSPs, BRAMs, and PLLs
⏱️ Latency analysis for timing-critical paths
📶 Physical board: a demonstration of the optimal physical implementation

Hermes Systems initially targets the DE-10 / Cyclone V platform, with an architecture designed to extend to Xilinx UltraScale+ and Intel Agilex, the platforms that power production HFT and aerospace systems. It acts as a rapid feasibility layer between human intent and HDL development.
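As a rough illustration of what a structured specification covering the outputs listed above could look like, here is a minimal sketch. It uses standard-library dataclasses as a stand-in for the project's Pydantic models. The class names, field names, and block descriptions are assumptions made for this example, not the actual Hermes schema, and the physical board view is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Resource types named in the output list above.
RESOURCE_TYPES = ("LUT", "FF", "DSP", "BRAM", "PLL")


@dataclass
class FunctionalBlock:
    name: str     # e.g. "debounce filter"
    purpose: str  # one-line description (illustrative wording)


@dataclass
class HardwareSpec:
    """Hypothetical container for one generated specification."""
    brief: str
    target: str
    blocks: List[FunctionalBlock] = field(default_factory=list)
    resources: Dict[str, int] = field(default_factory=dict)
    critical_path_latency_ns: Optional[float] = None

    def __post_init__(self) -> None:
        # Reject resource names outside the known primitive types.
        unknown = set(self.resources) - set(RESOURCE_TYPES)
        if unknown:
            raise ValueError(f"unknown resource types: {sorted(unknown)}")
        if any(count < 0 for count in self.resources.values()):
            raise ValueError("resource counts must be non-negative")


# The Education row from the table above, expressed in this sketch.
led_spec = HardwareSpec(
    brief="Blink an LED when a button is pressed",
    target="Cyclone V",
    blocks=[
        FunctionalBlock("GPIO", "button input and LED output"),
        FunctionalBlock("debounce filter", "suppress switch bounce"),
        FunctionalBlock("1-bit register", "hold the LED state"),
    ],
    resources={"LUT": 25},  # the ~25 LUT estimate from the table
)
```

Validating resource names at construction time means a malformed model response fails early instead of reaching the user as a plausible-looking spec.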

How we built it

We started with ground truth — because in high-stakes domains, "probably right" isn't good enough.

We built a 10-design validation set spanning four complexity tiers, each with real Quartus synthesis reports to compare against:

Tier	Example	Key resource
1 — Combinational	Debounced GPIO, binary counter	LUTs only
2 — FSM / Protocol	UART transmitter, SPI master, PWM	LUTs + FFs
3 — Memory / Comms	SPI master, 256-byte BRAM	+ BRAM inference
4 — DSP	8-tap FIR filter, VGA generator	+ DSPs, PLL

The pipeline is a multi-stage LLM architecture with a structured output schema designed around Cyclone V hardware primitives. Training data came from:

Intel/Altera documentation
Open-source RTL repositories and reference designs
Quartus synthesis reports from our hand-built validation set

Backend: FastAPI + Uvicorn, Pydantic models, OpenAI Python SDK (async), python-dotenv.
Frontend: React 18 + Three.js + WaveDrom + Mermaid, all CDN-loaded with no build step, deployed via Vercel.
Toolchain: iverilog + vvp for simulation; Yosys + nextpnr-ice40 + icepack for synthesis and bitstream generation.
Observability: Elasticsearch across three indices: designs, events, and an error-fix cache.
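The simulation and synthesis toolchain named above can be driven from Python roughly as follows. This is a sketch under assumptions: the function names, file layout, and the `hx8k` / `ct256` device and package defaults are placeholders, since the writeup does not say which iCE40 part or project structure was used. The command-line flags themselves are the standard ones for each tool.

```python
import subprocess
from pathlib import Path
from typing import List


def simulation_commands(design: Path, testbench: Path, out_dir: Path) -> List[List[str]]:
    """Compile with iverilog, then run the compiled image with vvp."""
    image = out_dir / "sim.vvp"
    return [
        ["iverilog", "-o", str(image), str(testbench), str(design)],
        ["vvp", str(image)],
    ]


def bitstream_commands(design: Path, top: str, pcf: Path, out_dir: Path,
                       device: str = "hx8k",      # placeholder device
                       package: str = "ct256"     # placeholder package
                       ) -> List[List[str]]:
    """Synthesise with Yosys, place and route with nextpnr, pack with icepack."""
    netlist = out_dir / f"{top}.json"
    asc = out_dir / f"{top}.asc"
    bitstream = out_dir / f"{top}.bin"
    return [
        ["yosys", "-q", "-p", f"synth_ice40 -top {top} -json {netlist}", str(design)],
        ["nextpnr-ice40", f"--{device}", "--package", package,
         "--json", str(netlist), "--pcf", str(pcf), "--asc", str(asc)],
        ["icepack", str(asc), str(bitstream)],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each step in order; stop at the first non-zero exit code."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Building the commands as plain lists keeps them easy to log and inspect before anything is executed. For example, `run_all(simulation_commands(Path("uart.v"), Path("uart_tb.v"), Path("build")))` would compile and run a testbench, assuming the tools are installed and the `build` directory exists.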

The frontend is a lightweight web interface delivering real-time brief input and structured specification output.

Challenges we ran into

Turns out "train a model on FPGA designs" is a sentence that hides enormous pain. And in aerospace, defense, or HFT, inaccuracy has real consequences.

The design intent problem: Training data gives you resource counts but not reasoning — a synthesis report tells you a DSP block was used, not why it was chosen over LUT-based logic. Teaching the model that distinction required careful schema design, not just more data.

The scoping problem: We had to cut the placement and routing layer early. It's a combinatorially hard, heavily proprietary problem that teams at Intel have spent years on. Recognising that boundary cleanly was its own challenge — knowing what not to build in 48 hours is harder than it sounds.

The ambiguity problem: "Display a pattern on an LED" is a completely different design depending on whether you mean one GPIO pin or a multiplexed 8×8 matrix. Early versions of the model answered confidently — and answered the wrong version of the question.

Accomplishments that we're proud of

The model correctly predicts the resource category for 9 of the 10 reference designs. That includes flagging the PLL requirement for the VGA signal generator, which is the kind of non-obvious inference we were targeting.

It's easy to build a system that confidently outputs wrong answers. It's harder to build one that knows what it doesn't know.

When we fed Hermes Systems an under-specified brief, watching it compare against preexisting designs and surface ambiguities rather than guess felt like the moment it became genuinely useful — not just a demo.

The self-healing Elasticsearch error cache is what made the system genuinely operational rather than just functional. We watched a failed synthesis retry succeed because a cached fix was automatically injected into the retry prompt, with no human in the loop. That showed what treating observability as infrastructure, not an afterthought, actually unlocks.
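The retry mechanism described above might be sketched like this. An in-memory dictionary stands in for the Elasticsearch error-fix index, and the normalisation rules, class name, and prompt wording are all assumptions made for illustration rather than the project's actual implementation.

```python
import re
from typing import Dict, Optional


def error_signature(log_line: str) -> str:
    """Normalise an error message so equivalent failures share one key."""
    text = log_line.lower()
    # Replace HDL file paths, then any remaining numbers (line numbers etc.).
    text = re.sub(r"[\w./\\-]+\.(?:v|sv|vhd)\b", "<file>", text)
    text = re.sub(r"\d+", "<n>", text)
    return re.sub(r"\s+", " ", text).strip()


class ErrorFixCache:
    """In-memory stand-in for the Elasticsearch error-fix index."""

    def __init__(self) -> None:
        self._fixes: Dict[str, str] = {}

    def record(self, error: str, fix: str) -> None:
        self._fixes[error_signature(error)] = fix

    def lookup(self, error: str) -> Optional[str]:
        return self._fixes.get(error_signature(error))


def build_retry_prompt(base_prompt: str, error: str, cache: ErrorFixCache) -> str:
    """Append the failure, plus a cached fix when one is known."""
    parts = [base_prompt, f"The previous attempt failed with:\n{error}"]
    fix = cache.lookup(error)
    if fix is not None:
        parts.append(f"A fix that resolved this error before:\n{fix}")
    return "\n\n".join(parts)


cache = ErrorFixCache()
cache.record("build/top.v:12: syntax error",
             "Check for a missing semicolon before the reported line.")

# A different file and line number still hits the same cached fix.
prompt = build_retry_prompt("Generate a UART transmitter.",
                            "src/uart.v:40: syntax error", cache)
```

Stripping file names and line numbers before the lookup is what lets a fix learned on one design apply to the same class of failure on another.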

We're also proud of building a principled evaluation methodology in 48 hours. Comparing predicted vs actual synthesis results across 10 reference designs gave us something concrete to stand behind beyond cherry-picked examples.
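A comparison of predicted and actual synthesis results could be scored along these lines. This sketch assumes that "resource category" means the set of primitive types a design uses (LUT, FF, BRAM, DSP, PLL). The function names are hypothetical and the numbers in the example are made up, not taken from the validation set.

```python
from typing import Dict, List, Set, Tuple

Resources = Dict[str, int]


def categories(resources: Resources) -> Set[str]:
    """The set of primitive types with a non-zero count."""
    return {name for name, count in resources.items() if count > 0}


def category_match(predicted: Resources, actual: Resources) -> bool:
    """True when the prediction uses exactly the same primitive types."""
    return categories(predicted) == categories(actual)


def relative_errors(predicted: Resources, actual: Resources) -> Dict[str, float]:
    """Per-resource |predicted - actual| / actual, for resources actually used."""
    errors: Dict[str, float] = {}
    for name, actual_count in actual.items():
        if actual_count > 0:
            errors[name] = abs(predicted.get(name, 0) - actual_count) / actual_count
    return errors


def category_accuracy(pairs: List[Tuple[Resources, Resources]]) -> float:
    """Fraction of (predicted, actual) pairs whose categories match."""
    return sum(category_match(p, a) for p, a in pairs) / len(pairs)


# Illustrative numbers only.
actual = {"LUT": 100, "FF": 64, "PLL": 1}
good_prediction = {"LUT": 110, "FF": 60, "PLL": 1}
missed_pll = {"LUT": 100, "FF": 64}

accuracy = category_accuracy([(good_prediction, actual), (missed_pll, actual)])
```

Scoring the category separately from the counts reflects the two different failure modes: missing a PLL entirely is a design error, while being 10% off on LUTs is an estimation error.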

What we learned

Hardware and software are closer than STEM students are usually taught. The core problem — mapping human intent to executable specification — is the same one that drove every wave of software tooling for the past thirty years.

Copilot didn't replace developers. It removed the friction from the first draft.

Hermes Systems is that same idea, applied one layer down — lowering the barrier for software-literate developers to contribute meaningfully in the world of hardware. We learned that hardware domain knowledge and software architecture instincts are genuinely complementary, and that pairing them with the right tooling produces something neither discipline could reach alone.

What's next for Hermes Systems

Expanding to Xilinx UltraScale+ is the obvious next hardware target: the pipeline architecture is platform-agnostic, and only the training data is device-specific.

We want Hermes to sit at the start of every FPGA project — the equivalent of a system design diagram for hardware. The roadmap:

🗣️ Conversational ambiguity resolution — clarifying dialogue before spec generation
📁 Quartus project template generation — closing the loop from spec to synthesisable starting point
🔁 Feedback loop — real synthesis results improving model predictions over time

The commercial case is clear: companies that need FPGA performance but are bound by tight development timelines, or that spend millions on engineers for FPGA design iteration. Today, engineers can spend months on FPGA design alone. Hermes Systems aims to cut that to weeks.