We don't guess.
We verify.
AI Architect builds agents that prove what they claim. 97.8% recall on LongMemEval, 5 fused retrieval signals, zero LLM-judges-LLM. Every output is traceable, every claim is checked — by deterministic algorithms, not by another model's opinion.
Four principles. No exceptions.
Every claim has a source.
If an agent states a fact, that fact ties back to a memory, a file, a commit, or a citation. No assertion lives without a trail.
Zero LLM-judges-LLM.
Verification is deterministic: graph analysis, semantic checks, atomic claim decomposition. We don't ask one model whether another model is right.
Compounding context.
Cortex applies neuroscience — spreading activation, dream cycles, microglial pruning — so agents remember what worked, not just what happened.
Built for regulated work.
Every PRD, PR, decision and reasoning step is logged and reviewable. Designed against the same bar as financial-systems software.
Watch an agent prove its work.
This is the actual verification report from a generated PRD: 64 atomic claims, decomposed and checked by six independent algorithms. The full audit trail lives next to the deliverable, not buried in a log file.
Cortex memory. Zetetic reasoning. Verified pipeline. One platform.
Cortex
Persistent memory for Claude Code.
LongMemEval
Zetetic Agents
Reasoning patterns. One epistemic standard.
97 + 19 specialists
Automatised Pipeline
Read-only codebase intelligence.
10 stages
PRD Spec Generator
Stateless reducer. Feature description → verified PRD.
9 pipeline steps
Hire us to build it. Or build it yourself with our tools.
Every component is open source, MIT-licensed, and shipping in production. The choice is whether you want the system handed to you or the keys to build it yourself.
We build the agent with you.
For operators who know AI should help — but don't want to spend six months stitching tutorials together. We design and ship the agent against your real infrastructure. Same engineering bar as the financial systems we build by day.
- Discovery — find the one workflow worth automating
- Built against your CRM, data, internal tools — not a sandbox
- Verification baked in: every action is auditable
- Hand-over with documentation & 30 days of post-launch support
Grab the templates.
Ship faster.
If you build with AI yourself, use the same components we use in production. Cortex, Zetetic Agents, Automatised Pipeline, and PRD Spec Generator — fully documented, MIT-licensed, no telemetry, no lock-in.
- Cortex — persistent memory with neuroscience-backed retrieval
- Zetetic Agents — 116 reasoning patterns, one epistemic standard
- Automatised Pipeline — read-only codebase intelligence (Rust MCP)
- PRD Spec Generator — 11 MCP tools, 9-step pipeline + 2-phase multi-judge verification, specialized panels by claim type
From "we should try AI" to a system you can audit. Four stages.
Every engagement runs the same protocol, the same one our open-source pipeline runs internally. You see working software early, you see verification at every step, and you own everything when we're done.
Discovery & framing
We map the workflow where an agent will actually move the needle. No slide decks, no AI theater. Output: a one-page spec with success criteria you can measure against.
Build with verification
Agent designed against your real systems. Every component ships with tests, provenance and an audit log. You watch it work in /cortex-visualize as we go.
Hand-over
Deployed to your infra. Runbooks, dashboards, and a verification report on every shipped feature. Your team is trained on how to extend it — not dependent on us forever.
Compounding
Cortex memory means the agent gets smarter every week without retraining. We stay on call for the first 30 days; after that, you own a system you can audit, evolve, and keep running.
founder
I ship critical systems by day. I research how agents should think by night.
By day I build software in financial infrastructure, where "mostly works" never ships. Every system has to be tested, verified, auditable. Or it doesn't go live.
By night I apply that same bar to AI. I started AI Architect because I kept seeing the same anti-pattern: teams treating agents like demos, stacking prompts on prompts, asking another LLM whether the first one got it right, and wondering why nothing held up in production.
The work here is zetetic — every claim is investigated, never assumed. The tools are open source because the frontier should be shared. The consulting exists because some teams need the system built with them, not handed a repo and a prayer.
"An agent without memory isn't intelligent. An agent without verification isn't trustworthy. I'm only interested in building both."
The paper behind the numbers.
Read it. Help us publish.
Stage-Aware Context Assembly for Long-Context Memory Retrieval
A new method for assembling retrieval context across long-running, multi-session conversations. Combines vector + lexical + heat-decay + temporal + entity signals through a stage-aware fusion that adapts to what the question is actually asking for.
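To make the idea of stage-aware fusion concrete, here is a minimal sketch of how five per-candidate signal scores could be combined with weights that depend on the kind of question. This is an illustration under stated assumptions, not the method from the paper: the stage names, the weight values, the `Candidate` type and the `fuse` function are all hypothetical, and a plain weighted sum stands in for whatever fusion the paper actually uses.

```python
from dataclasses import dataclass, field

# The five signals named in the abstract.
SIGNALS = ("vector", "lexical", "heat", "temporal", "entity")

# Hypothetical weight profiles per question stage. The real stages and
# weights are not given in the text; these values are assumptions.
STAGE_WEIGHTS = {
    "temporal": {"vector": 0.2, "lexical": 0.1, "heat": 0.1, "temporal": 0.5, "entity": 0.1},
    "entity": {"vector": 0.2, "lexical": 0.2, "heat": 0.1, "temporal": 0.1, "entity": 0.4},
    "default": {"vector": 0.4, "lexical": 0.3, "heat": 0.1, "temporal": 0.1, "entity": 0.1},
}


@dataclass
class Candidate:
    memory_id: str
    # Signal name -> score, assumed to be normalised to [0, 1].
    scores: dict = field(default_factory=dict)


def fused_score(candidate, stage):
    """Weighted sum of the candidate's signal scores for the given stage."""
    weights = STAGE_WEIGHTS.get(stage, STAGE_WEIGHTS["default"])
    return sum(weights[s] * candidate.scores.get(s, 0.0) for s in SIGNALS)


def fuse(candidates, stage):
    """Return candidates ranked best-first by their stage-weighted score."""
    return sorted(candidates, key=lambda c: fused_score(c, stage), reverse=True)


semantic_hit = Candidate("m1", {"vector": 0.9, "temporal": 0.1})
dated_hit = Candidate("m2", {"vector": 0.3, "temporal": 0.9})

by_default = [c.memory_id for c in fuse([semantic_hit, dated_hit], "default")]
by_temporal = [c.memory_id for c in fuse([semantic_hit, dated_hit], "temporal")]
```

In this toy example the same two candidates swap places depending on the stage: the semantically closest memory wins for a general question, while the memory with the strongest temporal match wins when the question is about timing.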
The temporal assembler component beats the BEAM oracle baseline (0.471 MRR vs the paper's 0.353, +33.4%), and the full pipeline reaches 97.8% Recall@10 on LongMemEval — +19.4 points over the published best. All retrieval-only metrics, no LLM-as-judge.
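For readers who want to check the numbers, the sketch below shows one common way to compute the two retrieval-only metrics mentioned above, plus the relative gain quoted for MRR. The function names are hypothetical, and the benchmarks may define Recall@10 slightly differently (for example, counting a query as a hit if any relevant item appears in the top 10), so treat this as an illustration rather than the evaluation code behind the results.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


def relative_gain(ours, baseline):
    """Relative improvement over a baseline, in percent."""
    return (ours - baseline) / baseline * 100.0


# The MRR figures quoted above: 0.471 against a baseline of 0.353.
gain = round(relative_gain(0.471, 0.353), 1)
```

With the quoted values, `gain` comes out at 33.4, matching the +33.4% stated above. None of these computations involves a model's judgement: they compare ranked IDs against a labelled set.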
The science behind the system.
Cortex draws from 41 peer-reviewed papers across neuroscience, memory research and AI evaluation. A few of the load-bearing ones:
Before you book a call.
The distribution_suspicious flag catches confirmatory bias. NFR claims never receive PASS, only SPEC-COMPLETE or NEEDS-RUNTIME.
UNSOURCED / MAGIC_NUMBER / TODO_NO_REF
Cortex:
claude plugin marketplace add cdeust/Cortex && claude plugin install cortex (requires PostgreSQL + pgvector)
Zetetic:
claude plugin marketplace add cdeust/zetetic-team-subagents && claude plugin install zetetic-team-subagents
Automatised Pipeline:
claude plugin marketplace add cdeust/automatised-pipeline && claude plugin install automatised-pipeline (the plugin builds the Rust binary on first install; Rust 1.94+ and CMake required)
PRD Spec Generator:
claude plugin marketplace add cdeust/prd-spec-generator && claude plugin install prd-spec-generator (Node.js 20.x or 22.x)
All four interoperate: Cortex remembers, Zetetic reasons, Pipeline maps the codebase, PRD Spec Generator adjudicates the spec.
Tell us what you want the agent to do.
We'll tell you if it can be verified.
A 30-minute call. No pitch deck, no commitment. If your problem doesn't fit what we do, we'll point you somewhere that does.