Demonstrations of using CodeLogician and ImandraX for formal verification of real-world software, driven by AI coding agents.
CodeLogician is a neurosymbolic governance framework that helps AI coding agents reason about complex software using mathematical logic and automated reasoning. ImandraX is the underlying formal reasoning engine.
Hands-on tutorial for the codelogician eval command and all its subcommands.
A step-by-step guide that walks through the full eval pipeline — from type-checking IML code to proving properties to generating tests — using minimal, self-contained examples. Also includes a walkthrough of the stripe/ example to show the same pipeline on real-world business logic.
Covers:
- Error feedback — what syntax and type errors look like in the CLI
eval check— type-checking and admitting IML definitionseval list-vg/eval check-vg— listing and running verification goals (verifyandinstance)eval list-decomp/eval check-decomp— region decomposition of function input spaceseval gen-test— generating Python/TypeScript test cases from decomposition regions- Stripe walkthrough — running the full pipeline on the payment flow model (bug discovery, 14 proofs, 84 regions, test generation)
| File | Description |
|---|---|
eval-tutorial.md |
The full tutorial with runnable commands |
iml/errors.iml |
Syntax and type error examples (uncomment to trigger) |
iml/basics.iml |
Simple functions (my_abs, clamp) for eval check |
iml/verify_abs.iml |
verify and instance goals on my_abs |
iml/decomp_clamp.iml |
Region decomposition on clamp (3 regions) |
iml/decomp_discount.iml |
Decomposition with record types and ~assuming (4 regions) |
# Install CodeLogician
curl -fsSL https://codelogician.dev/codelogician/install.sh | sh
# Set your API key (free tier at https://universe.imandra.ai)
export IMANDRA_UNI_KEY=your_key_here
# Run the tutorial examples
cd intro
codelogician eval check iml/basics.iml
codelogician eval check-vg --check-all iml/verify_abs.iml
codelogician eval check-decomp --index 1 iml/decomp_clamp.iml
codelogician eval gen-test iml/decomp_discount.iml -f discount -l pythonFormally verifying a Stripe payment flow with Claude and CodeLogician.
Takes a production Stripe payment processing flow (Flask/Python, ~700 lines) through a complete formal verification pipeline:
- Formalize — Claude translates the Python payment state machine into IML (Imandra Modeling Language)
- Verify — Claude writes verification goals and runs them through ImandraX, which catches a real bug (missing
amount_refundedreset incapture_payment) and logically flawed properties - Extend — New features added (high-risk multi-approval policy, 3DS/SCA challenge flow) and verified with 14 proof obligations
- Decompose — ImandraX region decomposition discovers 84 disjoint execution paths (53 for capture alone)
- Test — 84 test cases generated from the decomposition, all passing against the Python implementation
| File | Description |
|---|---|
stripe_payment_flow.py |
Original production Stripe payment flow (Flask API) |
stripe_payment_flow_updated.py |
Updated Python implementation with new features |
stripe_flow_original.iml |
IML formal model before bug fix (1 VG refuted) |
stripe_flow_updated.iml |
IML formal model after fix + new features (14/14 VGs proved) |
test_stripe_flow.py |
84 pytest cases generated by region decomposition |
workflow_diagram.png |
Claude + CodeLogician workflow diagram |
medium_post.md |
Writeup of the full process |
# Install CodeLogician (requires IMANDRA_UNI_KEY from https://universe.imandra.ai)
curl -fsSL https://codelogician.dev/codelogician/install.sh | sh
# Check the original model (has the bug)
codelogician eval check stripe/stripe_flow_original.iml
codelogician eval check-vg --check-all stripe/stripe_flow_original.iml # 1 refuted
# Check the updated model (bug fixed, new features)
codelogician eval check stripe/stripe_flow_updated.iml
codelogician eval check-vg --check-all stripe/stripe_flow_updated.iml # 14/14 proved
# Run region decomposition
codelogician eval check-decomp --check-all stripe/stripe_flow_updated.iml # 84 regions
# Generate test cases
codelogician eval gen-test stripe/stripe_flow_updated.iml -f step -l python
# Run the tests
cd stripe && python -m pytest test_stripe_flow.py -v # 84/84 passedIndustry case studies from codelogician.dev/docs/industry-case-studies — formal verification applied to real financial systems.
Nine case studies demonstrating bug discovery, redundancy detection, and behavioral analysis across trading, banking, and regulatory domains. Each directory contains an IML formal model with verification goals and/or region decompositions, plus a README with results.
| Case Study | Key Result | Technique |
|---|---|---|
| algorithmic-trading/ | Redundant XYZ trading rule PROVED removable | Verification |
| lse-gtt-order-expiry/ | Race condition in GTT expiry vs auction uncross | State machine modeling |
| bank-account-classification/ | 21% code growth → 190% region growth (19→55) | Decomposition |
| lse-stop-orders/ | Simultaneous buy+sell election bug REFUTED | Verification + counterexample |
| interest-rate-swap/ | No weekends PROVED, naive monotonicity REFUTED | Verification |
| multilateral-netting/ | Float precision bug; exact arithmetic in IML | Modeling |
| margin-account/ | Solvency safety PROVED, gap risk safe | Verification + instance |
| exchange-fee-schedule/ | Penny Jump PROVED, 5 fee regions found | Verification + decomposition |
| mifid-ii-reporting/ | Jurisdiction safety PROVED, 12 reporting regions | Verification + decomposition |
# Run any case study (example: interest rate swap)
cd case-studies/interest-rate-swap
codelogician-lite check irs_scheduler.iml
codelogician-lite check-vg irs_scheduler.iml- CodeLogician — Neurosymbolic governance framework for AI-powered coding
- ImandraX — Automated reasoning engine
- Tackle Complexity with Math and Logic — The motivation behind this approach
- Imandra Universe — Get an API key (free tier available)