Skip to content

imandra-ai/codelogician-demos

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 

Repository files navigation

CodeLogician Demos

Demonstrations of using CodeLogician and ImandraX for formal verification of real-world software, driven by AI coding agents.

CodeLogician is a neurosymbolic governance framework that helps AI coding agents reason about complex software using mathematical logic and automated reasoning. ImandraX is the underlying formal reasoning engine.

Demos

Hands-on tutorial for the codelogician eval command and all its subcommands.

A step-by-step guide that walks through the full eval pipeline — from type-checking IML code to proving properties to generating tests — using minimal, self-contained examples. Also includes a walkthrough of the stripe/ example to show the same pipeline on real-world business logic.

Covers:

  1. Error feedback — what syntax and type errors look like in the CLI
  2. eval check — type-checking and admitting IML definitions
  3. eval list-vg / eval check-vg — listing and running verification goals (verify and instance)
  4. eval list-decomp / eval check-decomp — region decomposition of function input spaces
  5. eval gen-test — generating Python/TypeScript test cases from decomposition regions
  6. Stripe walkthrough — running the full pipeline on the payment flow model (bug discovery, 14 proofs, 84 regions, test generation)

Files

File Description
eval-tutorial.md The full tutorial with runnable commands
iml/errors.iml Syntax and type error examples (uncomment to trigger)
iml/basics.iml Simple functions (my_abs, clamp) for eval check
iml/verify_abs.iml verify and instance goals on my_abs
iml/decomp_clamp.iml Region decomposition on clamp (3 regions)
iml/decomp_discount.iml Decomposition with record types and ~assuming (4 regions)

Quick start

# Install CodeLogician
curl -fsSL https://codelogician.dev/codelogician/install.sh | sh

# Set your API key (free tier at https://universe.imandra.ai)
export IMANDRA_UNI_KEY=your_key_here

# Run the tutorial examples
cd intro
codelogician eval check iml/basics.iml
codelogician eval check-vg --check-all iml/verify_abs.iml
codelogician eval check-decomp --index 1 iml/decomp_clamp.iml
codelogician eval gen-test iml/decomp_discount.iml -f discount -l python

Formally verifying a Stripe payment flow with Claude and CodeLogician.

Takes a production Stripe payment processing flow (Flask/Python, ~700 lines) through a complete formal verification pipeline:

  1. Formalize — Claude translates the Python payment state machine into IML (Imandra Modeling Language)
  2. Verify — Claude writes verification goals and runs them through ImandraX, which catches a real bug (missing amount_refunded reset in capture_payment) and logically flawed properties
  3. Extend — New features added (high-risk multi-approval policy, 3DS/SCA challenge flow) and verified with 14 proof obligations
  4. Decompose — ImandraX region decomposition discovers 84 disjoint execution paths (53 for capture alone)
  5. Test — 84 test cases generated from the decomposition, all passing against the Python implementation

Files

File Description
stripe_payment_flow.py Original production Stripe payment flow (Flask API)
stripe_payment_flow_updated.py Updated Python implementation with new features
stripe_flow_original.iml IML formal model before bug fix (1 VG refuted)
stripe_flow_updated.iml IML formal model after fix + new features (14/14 VGs proved)
test_stripe_flow.py 84 pytest cases generated by region decomposition
workflow_diagram.png Claude + CodeLogician workflow diagram
medium_post.md Writeup of the full process

Running it yourself

# Install CodeLogician (requires IMANDRA_UNI_KEY from https://universe.imandra.ai)
curl -fsSL https://codelogician.dev/codelogician/install.sh | sh

# Check the original model (has the bug)
codelogician eval check stripe/stripe_flow_original.iml
codelogician eval check-vg --check-all stripe/stripe_flow_original.iml  # 1 refuted

# Check the updated model (bug fixed, new features)
codelogician eval check stripe/stripe_flow_updated.iml
codelogician eval check-vg --check-all stripe/stripe_flow_updated.iml   # 14/14 proved

# Run region decomposition
codelogician eval check-decomp --check-all stripe/stripe_flow_updated.iml  # 84 regions

# Generate test cases
codelogician eval gen-test stripe/stripe_flow_updated.iml -f step -l python

# Run the tests
cd stripe && python -m pytest test_stripe_flow.py -v  # 84/84 passed

Industry case studies from codelogician.dev/docs/industry-case-studies — formal verification applied to real financial systems.

Nine case studies demonstrating bug discovery, redundancy detection, and behavioral analysis across trading, banking, and regulatory domains. Each directory contains an IML formal model with verification goals and/or region decompositions, plus a README with results.

Case Study Key Result Technique
algorithmic-trading/ Redundant XYZ trading rule PROVED removable Verification
lse-gtt-order-expiry/ Race condition in GTT expiry vs auction uncross State machine modeling
bank-account-classification/ 21% code growth → 190% region growth (19→55) Decomposition
lse-stop-orders/ Simultaneous buy+sell election bug REFUTED Verification + counterexample
interest-rate-swap/ No weekends PROVED, naive monotonicity REFUTED Verification
multilateral-netting/ Float precision bug; exact arithmetic in IML Modeling
margin-account/ Solvency safety PROVED, gap risk safe Verification + instance
exchange-fee-schedule/ Penny Jump PROVED, 5 fee regions found Verification + decomposition
mifid-ii-reporting/ Jurisdiction safety PROVED, 12 reporting regions Verification + decomposition
# Run any case study (example: interest rate swap)
cd case-studies/interest-rate-swap
codelogician-lite check irs_scheduler.iml
codelogician-lite check-vg irs_scheduler.iml

Learn More

About

CodeLogician demos and tutorials

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages