Route, plan, verify, and execute multi-agent workflows with learned routing, formal safety proofs, and full observability.
Quick Start • Architecture • Features • Benchmarks • Examples • Contributing
Lattice is a Python framework for building multi-agent systems with formal safety guarantees. It combines a contextual bandit router that learns optimal agent assignment from execution feedback, a hybrid ReAct + Plan-and-Solve planner for task decomposition, and a Z3-powered verifier that proves DAG acyclicity, budget feasibility, and capability constraints before any code executes. The result is a production-grade orchestration layer where agents are routed intelligently, plans are verified mathematically, and every step is traced end-to-end via OpenTelemetry.
## Architecture

```mermaid
graph LR
    subgraph Input
        T[Task Input]
    end
    subgraph Orchestration
        R[Router<br/><i>Contextual Bandits</i>]
        P[Planner<br/><i>ReAct + Plan-and-Solve</i>]
        V[Verifier<br/><i>Z3 Safety Proofs</i>]
        E[Executor<br/><i>DAG Engine</i>]
    end
    subgraph Agents
        A1[Tool Agent]
        A2[Critic Agent]
        A3[Human Agent]
        A4[Custom Agents]
    end
    subgraph Output
        O[Result + Traces]
    end
    T --> R
    R --> P
    P --> V
    V --> E
    E --> A1 & A2 & A3 & A4
    A1 & A2 & A3 & A4 --> O
    M[(Scoped Memory)] -.-> R & P & E
    OB[Observability<br/><i>OpenTelemetry</i>] -.-> R & P & V & E
    style R fill:#4A90D9,color:#fff
    style P fill:#7B68EE,color:#fff
    style V fill:#E74C3C,color:#fff
    style E fill:#2ECC71,color:#fff
    style M fill:#F39C12,color:#fff
    style OB fill:#95A5A6,color:#fff
```
Pipeline flow: A task enters the Router, which uses a learned contextual bandit policy to select the optimal agent. The Planner decomposes the task into a DAG of sub-goals. The Verifier proves the plan is safe via Z3 constraint solving. The Executor dispatches sub-goals in topological order with parallel execution, retry, and checkpointing. Scoped Memory provides hierarchical state sharing across steps, and OpenTelemetry traces the entire pipeline.
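The executor's core scheduling idea can be sketched with the standard library alone. This is an illustration, not Lattice's actual `Executor`: `graphlib.TopologicalSorter` stands in for the DAG engine (its cycle check also plays the role the Z3 acyclicity proof plays in Lattice), and `run_step` stands in for real agent work.

```python
import asyncio
from graphlib import CycleError, TopologicalSorter

# Hypothetical 4-step plan: b and c depend on a; d depends on both.
PLAN = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}

async def run_step(name: str) -> str:
    await asyncio.sleep(0)  # stand-in for real agent work
    return f"done:{name}"

async def execute(dag: dict[str, set[str]]) -> list[str]:
    ts = TopologicalSorter(dag)
    try:
        ts.prepare()  # raises CycleError if the plan graph has a cycle
    except CycleError as exc:
        raise ValueError(f"plan rejected: {exc}") from None
    order: list[str] = []
    while ts.is_active():
        ready = ts.get_ready()  # every step whose dependencies are satisfied
        # Dispatch the whole ready set concurrently, as the Executor does.
        order.extend(await asyncio.gather(*(run_step(n) for n in ready)))
        for n in ready:
            ts.done(n)
    return order

print(asyncio.run(execute(PLAN)))  # "done:a" runs first, "done:d" last
```

Steps `b` and `c` run in the same batch because neither depends on the other, which is exactly the parallelism the topological order exposes.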
## Features

| Feature | Description |
|---|---|
| Learned Routing | Contextual bandit router (epsilon-greedy, UCB, Thompson sampling) that learns optimal agent assignment from task embeddings and execution feedback via an online-trained MLP |
| Adaptive Planning | Hybrid ReAct + Plan-and-Solve planner with Voyager-style skill caching -- auto-selects strategy based on estimated task complexity |
| Formal Verification | Z3-based safety verification checks DAG acyclicity, budget feasibility, capability matching, and custom policies before any execution |
| DAG Execution | Topological execution engine with dependency tracking, parallel dispatch, configurable concurrency, retry with exponential backoff, and timeouts |
| Scoped Memory | Hierarchical shared memory with namespace inheritance, TTL expiration, semantic similarity search, LRU eviction, and optional Redis persistence |
| Token Streaming | Async token-level streaming with backpressure, multi-consumer fan-out, and real-time throughput statistics |
| Checkpointing | Execution checkpoints for fault recovery -- resume failed plans from the last successful step |
| Cost Attribution | Per-agent, per-model cost tracking with budget monitoring and alerts, covering 100+ models via LiteLLM |
| OpenTelemetry | Full distributed tracing with Lattice-specific span attributes -- export to Jaeger, Honeycomb, or any OTLP collector |
| Constitutional Critic | Evaluation agent that scores outputs against configurable principles (helpfulness, accuracy, safety, coherence, completeness) with weighted scoring |
| Human-in-the-Loop | Queue-based human approval agent with timeout, pre-loaded responses for testing, and full audit logging |
Lattice implements ideas from the following research:
| Paper | What Lattice Uses |
|---|---|
| Voyager (Wang et al., 2023) | Skill library pattern: successful plans are cached and reused for similar future tasks via the memory system |
| Plan-and-Solve (Wang et al., 2023) | High-level task decomposition into ordered sub-goals with dependency edges, forming a DAG for the executor |
| ReAct (Yao et al., 2023) | Interleaved thought-action-observation loops for tool-using agents and fine-grained execution planning |
| Contextual Bandits (Agarwal et al., 2014) | Learned routing via reward prediction on task embeddings with epsilon-greedy, UCB, and Thompson exploration |
| Constitutional AI (Bai et al., 2022) | Critic agent that evaluates outputs against configurable principles with weighted scoring and revision suggestions |
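The contextual-bandit routing idea reduces to a small loop: predict a reward for each agent from the task embedding, mostly pick the best, sometimes explore, and update the predictor from observed rewards. The sketch below is a minimal epsilon-greedy version with a per-agent *linear* reward model; Lattice's router trains an MLP and also offers UCB and Thompson sampling, so treat every name here as hypothetical.

```python
import random

class EpsilonGreedyRouter:
    """Minimal epsilon-greedy contextual bandit over task embeddings.
    Illustrative only: one linear reward head per agent, trained by SGD."""

    def __init__(self, agents: list[str], dim: int,
                 epsilon: float = 0.1, lr: float = 0.05):
        self.agents, self.epsilon, self.lr = agents, epsilon, lr
        self.weights = {a: [0.0] * dim for a in agents}

    def _predict(self, agent: str, x: list[float]) -> float:
        return sum(w * xi for w, xi in zip(self.weights[agent], x))

    def route(self, x: list[float]) -> str:
        if random.random() < self.epsilon:   # explore
            return random.choice(self.agents)
        return max(self.agents, key=lambda a: self._predict(a, x))  # exploit

    def feedback(self, agent: str, x: list[float], reward: float) -> None:
        # One SGD step on the squared error between predicted and observed reward.
        err = reward - self._predict(agent, x)
        for i, xi in enumerate(x):
            self.weights[agent][i] += self.lr * err * xi

router = EpsilonGreedyRouter(["tool", "critic"], dim=3, epsilon=0.0)
x = [1.0, 0.0, 1.0]
for _ in range(50):                  # "tool" keeps earning reward 1.0 on x
    router.feedback("tool", x, 1.0)
print(router.route(x))  # "tool"
```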
## Quick Start

```bash
pip install lattice
```

Or install from source with dev dependencies:

```bash
git clone https://github.com/JiwaniZakir/lattice.git
cd lattice
pip install -e ".[dev]"
```

```python
import asyncio

import numpy as np

from lattice import Router, Planner, Verifier, Executor
from lattice.agents.base import AgentResult, BaseAgent


# Define a custom agent
class MyAgent(BaseAgent):
    async def execute(self, task, context=None):
        return AgentResult(
            agent_id=self.agent_id,
            task=task,
            output=f"Completed: {task}",
        )


async def main():
    agent = MyAgent(agent_id="my_agent", name="My Agent")

    # 1. Route -- contextual bandit selects the best agent
    router = Router(agents=[agent], embedding_dim=384)
    embedding = np.random.randn(384).astype(np.float32)
    decision = await router.route(embedding)

    # 2. Plan -- decompose task into sub-goals
    planner = Planner()
    plan = await planner.plan(
        task="Analyze the quarterly report",
        available_agents=[decision.agent_id],
    )

    # 3. Verify -- Z3 proves the plan is safe
    verifier = Verifier()
    check = await verifier.verify_plan(plan)
    assert check.is_safe

    # 4. Execute -- DAG engine runs sub-goals with parallelism
    executor = Executor(agents={"my_agent": agent})
    result = await executor.execute(plan)
    print(result.status, result.total_cost_usd)


asyncio.run(main())
```

src/lattice/
```text
├── core/                 # Router, Planner, Executor, Verifier, Memory
│   ├── router.py         # Contextual bandit routing with MLP reward predictor
│   ├── planner.py        # Hybrid ReAct + Plan-and-Solve with skill caching
│   ├── verifier.py       # Z3-based safety verification engine
│   ├── executor.py       # DAG execution with concurrency and retry
│   └── memory.py         # Scoped hierarchical memory with TTL and Redis
├── agents/               # Agent implementations
│   ├── base.py           # BaseAgent ABC + AgentResult + streaming protocol
│   ├── tool.py           # ReAct tool-using agent with tool registry
│   ├── critic.py         # Constitutional AI evaluation agent
│   └── human.py          # Human-in-the-loop approval agent
├── routing/              # Routing subsystem
│   ├── classifier.py     # Task type classification
│   ├── embedder.py       # Task embedding generation
│   └── feedback.py       # Reward feedback processing
├── verification/         # Verification subsystem
│   ├── invariants.py     # Safety invariant definitions
│   ├── solver.py         # Z3 solver wrapper
│   └── policies.py       # Custom constraint policies
├── execution/            # Execution subsystem
│   ├── dag.py            # DAG builder and topological sort
│   ├── streaming.py      # Token-level async streaming with fan-out
│   └── checkpointing.py  # Checkpoint save/restore for fault recovery
├── observability/        # Observability subsystem
│   ├── tracing.py        # OpenTelemetry span management
│   ├── metrics.py        # Metrics collection
│   └── logging.py        # Structured logging via structlog
└── integrations/         # LLM provider integrations
    ├── openai.py         # OpenAI provider
    ├── anthropic.py      # Anthropic provider
    └── litellm.py        # LiteLLM unified provider (100+ models)
```
## Benchmarks

Measured on Apple M2 Pro, Python 3.12, single process:

| Benchmark | Value |
|---|---|
| Router decision latency (p50) | ~50 µs |
| Router decision latency (p95) | ~120 µs |
| Reward convergence (500 rounds) | >0.85 mean reward |
| DAG execution (10 parallel steps) | ~15 ms |
| DAG execution (20 sequential steps) | ~45 ms |
| Memory set/get (in-process) | ~5 µs |
| Z3 verification (3-step plan) | ~2 ms |
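For context on how latency figures like the p50/p95 rows above can be produced, here is a generic micro-benchmark harness. It is an illustrative sketch, not the code in `benchmarks/`; the function names are hypothetical, and the workload is a trivial stand-in.

```python
import statistics
import time

def measure_latency(fn, *, rounds: int = 10_000) -> tuple[float, float]:
    """Call `fn` repeatedly; return (p50, p95) latency in microseconds."""
    samples = []
    for _ in range(rounds):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1e6)
    samples.sort()
    p50 = statistics.median(samples)
    p95 = samples[int(0.95 * (len(samples) - 1))]  # nearest-rank percentile
    return p50, p95

# Example: time a trivial stand-in for a routing decision.
p50, p95 = measure_latency(lambda: sum(range(100)))
print(f"p50={p50:.1f}µs p95={p95:.1f}µs")
```

Note that single-process wall-clock numbers like these are sensitive to hardware, interpreter version, and background load, which is why the table states its measurement environment.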
Run benchmarks yourself:

```bash
python benchmarks/routing_benchmark.py
python benchmarks/throughput_benchmark.py
```

## Examples

| Example | Description |
|---|---|
| `quickstart.py` | Full pipeline walkthrough: agent creation, routing, planning, verification, execution |
| `multi_agent_research.py` | Multi-agent DAG with critic evaluation and cost tracking |
| `verified_workflow.py` | Z3 verification demos: DAG safety, budget proofs, capability matching |
```bash
# Run the quickstart example
python examples/quickstart.py
```

## Development

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (run once after cloning)
pre-commit install

# Run tests
pytest tests/ -v

# Run linter
ruff check src/ tests/

# Run type checker
mypy src/lattice/

# Run a single test file
pytest tests/test_router.py -v
```

Pre-commit hooks run automatically on `git commit` and enforce `ruff check --fix`, `ruff format`, trailing-whitespace cleanup, end-of-file normalization, YAML validity, and large-file detection.
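A `.pre-commit-config.yaml` matching that hook list might look like the following. This is a sketch only -- the repository's actual config, including the pinned `rev` values, may differ:

```yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9  # hypothetical pin -- check the repo's actual revision
    hooks:
      - id: ruff
        args: [--fix]
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # hypothetical pin
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
```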
## Contributing

Contributions are welcome! Here is how to get started:

- Fork the repository
- Create a feature branch (`git checkout -b feature/my-feature`)
- Write tests for your changes
- Ensure all checks pass (`pytest`, `ruff`, `mypy`)
- Submit a pull request
Please open an issue first for major changes so we can discuss the approach.
## License

This project is licensed under the MIT License.
Built with research. Verified with proofs. Orchestrated at scale.