There is a kind of building that does not require recognition.
That gives tools to strangers who will never know your name,
because the alternative — not building — is worse.
You ship an agent. It works in your notebook. Then reality happens:
| Failure Mode | What Goes Wrong |
|---|---|
| 🔀 Routing costs | Every query hits the expensive model — you pay GPT-4 prices for "what's the weather" |
| 🧠 No memory | Each turn starts from zero — your agent forgets what it said three messages ago |
| ♾️ Infinite loops | Tool calls chain forever, cost explodes, timeouts crash the whole pipeline |
| ❌ No evaluation | You have no idea if the agent is actually doing its job |
| 💥 Bad output | Responses fail schema validation, leak PII, or need 5 retries before they're usable |
| 👁️ Blind runs | Production breaks and you have zero traces, zero spans, zero idea what happened |
| 💸 No budgets | Token costs spiral — no per-run caps, no cost ledger, no circuit breakers |
| 🔄 No recovery | A crash mid-run loses all state — no checkpoint, no resume, no rollback |
| 🗳️ Single agent bias | One model, one opinion — no consensus, no majority vote, no cross-check |
| 🌊 No flow control | Queue overflows, queues block, nothing back-pressures, nothing throttles |
arsenal is the fix. 100 surgically targeted libraries. Zero lock-in. Zero external dependencies.
Route → Budget → Guard → Remember → Compress → Observe → Evaluate → Validate
→ Retry → State → Schema → Pipeline → Cache → Config → Events → Health
→ RateLimit → CircuitBreak → Stream → Benchmark → Router → Notifier → Pool
→ Discovery → Checkpoint → Queue → Planner → Limiter → Sandbox → Session
→ Audit → Consensus → Fallback → Throttle → Workflow → Trace → Saga
→ Signal → Lock → Telemetry → Serializer → Hook → Filter → Command
Each step in this lifecycle is a library. Each library does one thing. None of them know the others exist.
| Library | What it does | Install |
|---|---|---|
| herald | Semantic routing without a single token — directs queries to the right model using embeddings | pip install herald |
| agent-router | Runtime dispatch — routes agent actions to the correct handler | pip install agent-router |
| agent-dispatcher | Fan-out routing — broadcasts tasks to multiple downstream agents | pip install agent-dispatcher |
| agent-classifier | Input classification — labels incoming queries before routing | pip install agent-classifier |
| agent-selector | Candidate selection — picks the best model, tool, or handler from a pool | pip install agent-selector |
| agent-balancer | Load balancing for LLM endpoints — RoundRobin, WeightedRandom, LeastConnections with health tracking and automatic failover | pip install agent-balancer |
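The strategies in the agent-balancer row are standard load-balancing patterns. As a rough illustration — plain Python, not agent-balancer's actual API — round-robin balancing is just cycling through a pool of healthy endpoints:

```python
from itertools import cycle

# Illustrative sketch only — not the agent-balancer API.
# Round-robin hands each request to the next endpoint in order.
class RoundRobinBalancer:
    def __init__(self, endpoints):
        self._cycle = cycle(list(endpoints))

    def next_endpoint(self) -> str:
        return next(self._cycle)

balancer = RoundRobinBalancer(["gpt-4", "gpt-3.5-turbo", "claude"])
picks = [balancer.next_endpoint() for _ in range(4)]
# Wraps around once the pool is exhausted: the 4th pick is "gpt-4" again
```

The real library adds what a sketch like this omits: health tracking, weighted strategies, and automatic failover.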
| Library | What it does | Install |
|---|---|---|
| engram | Short-term + episodic memory — context windows that actually persist across turns | pip install engram |
| agent-context | Context management — builds, trims, and injects context into prompts | pip install agent-context |
| agent-context-window | Window sizing — manages token limits and context eviction strategies | pip install agent-context-window |
| agent-state | Agent state machine — models agent lifecycle as explicit state transitions | pip install agent-state |
| agent-store | Key-value state storage — persistent agent memory with TTL and namespacing | pip install agent-store |
| agent-kv | In-process key-value store — fast ephemeral state for single-run agents | pip install agent-kv |
| agent-cache | Response caching — caches LLM outputs by prompt hash to cut costs | pip install agent-cache |
| agent-snapshot | State snapshots — point-in-time captures of full agent state | pip install agent-snapshot |
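The idea behind agent-cache — keying LLM responses by a hash of the prompt and model so repeat queries cost nothing — can be sketched in a few lines of plain Python (this is the concept, not agent-cache's actual API):

```python
import hashlib

# Illustrative sketch — not the agent-cache API.
# Cache LLM outputs under a hash of (model, prompt).
class PromptCache:
    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(prompt: str, model: str) -> str:
        return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

    def get(self, prompt: str, model: str):
        return self._store.get(self._key(prompt, model))

    def put(self, prompt: str, model: str, response: str):
        self._store[self._key(prompt, model)] = response

cache = PromptCache()
cache.put("what is 2 + 2?", "gpt-3.5-turbo", "4")
hit = cache.get("what is 2 + 2?", "gpt-3.5-turbo")   # cache hit: "4"
miss = cache.get("what is 3 + 3?", "gpt-3.5-turbo")  # cache miss: None
```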
| Library | What it does | Install |
|---|---|---|
| sentinel | Loop detection, cost caps, timeout guards — stops runaway agents before they bankrupt you | pip install sentinel |
| agent-guard | Generic guardrails — pre/post-execution safety checks for any agent action | pip install agent-guard |
| agent-guardrails | Output safety — schema enforcement, PII redaction, retry logic | pip install agent-guardrails |
| agent-sandbox | Restricted code execution — runs agent-generated code in a confined namespace | pip install agent-sandbox |
| agent-rate-limiter | Rate limiting — token-bucket and sliding-window rate limits per agent or API key | pip install agent-rate-limiter |
| agent-circuit-breaker | Circuit breaking — opens on repeated failures, recovers gracefully | pip install agent-circuit-breaker |
| agent-policy | Policy enforcement — declarative rules for what agents can and cannot do | pip install agent-policy |
| agent-semaphore | Concurrency limits — caps parallel agent executions to prevent overload | pip install agent-semaphore |
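Token-bucket rate limiting, as listed for agent-rate-limiter, is a well-known mechanism. A minimal deterministic sketch (plain Python with an injected clock, not the library's actual API):

```python
# Illustrative token-bucket sketch — the mechanism agent-rate-limiter
# names, not its actual API. The clock is passed in for determinism.
class TokenBucket:
    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self._last = 0.0

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self._last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=2, refill_per_sec=1.0)
burst = [bucket.allow(now=0.0) for _ in range(3)]  # [True, True, False]
later = bucket.allow(now=1.0)                      # one token refilled: True
```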
| Library | What it does | Install |
|---|---|---|
| verdict | 3D evaluation — task completion, reasoning quality, tool-use correctness in one call | pip install verdict |
| agent-scorer | Scoring pipelines — pluggable scoring criteria for any agent output | pip install agent-scorer |
| agent-benchmark | Performance benchmarking — latency, throughput, and accuracy tracking | pip install agent-benchmark |
| agent-specs | Specification testing — test agents against behavioral specs, not just outputs | pip install agent-specs |
| agent-profiler | Profiling — identifies bottlenecks in agent execution paths | pip install agent-profiler |
| Library | What it does | Install |
|---|---|---|
| agent-validator | Output validation — validates LLM responses against typed schemas | pip install agent-validator |
| agent-schema | Schema enforcement — defines and enforces structured output contracts | pip install agent-schema |
| agent-formatter | Output formatting — normalizes agent responses to consistent formats | pip install agent-formatter |
| agent-template | Prompt templates — typed, composable templates with variable injection | pip install agent-template |
| agent-serializer | Serialization — converts agent state and outputs to/from wire formats | pip install agent-serializer |
| agent-converter | Type conversion — transforms data between agent-compatible formats | pip install agent-converter |
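What "validates LLM responses against typed schemas" means in practice: check each expected field for presence and type before the response reaches downstream code. A minimal sketch of the contract (not the agent-validator or agent-schema API):

```python
# Illustrative schema check — the contract agent-validator / agent-schema
# describe, not their actual APIs.
SCHEMA = {"answer": str, "confidence": float}

def validate(response: dict, schema: dict) -> list[str]:
    """Return a list of validation errors; empty list means the response passes."""
    errors = []
    for field, expected in schema.items():
        if field not in response:
            errors.append(f"missing field: {field}")
        elif not isinstance(response[field], expected):
            errors.append(f"{field}: expected {expected.__name__}")
    return errors

ok = validate({"answer": "4", "confidence": 0.9}, SCHEMA)      # passes: []
bad = validate({"answer": "4", "confidence": "high"}, SCHEMA)  # one type error
```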
| Library | What it does | Install |
|---|---|---|
| agent-observability | Full-pipeline tracing — spans, latency, token cost, JSONL export for every run | pip install agent-observability |
| agent-observer | Event observation — subscribes to agent lifecycle events without modifying them | pip install agent-observer |
| agent-trace | Distributed tracing — propagates trace context across agent boundaries | pip install agent-trace |
| agent-tracer | Span tracking — creates, manages, and exports spans in any tracing format | pip install agent-tracer |
| agent-telemetry | Counter/Gauge/Histogram with percentile tracking — production metrics for agents | pip install agent-telemetry |
| agent-log | Structured logging — JSON-first logging with trace correlation | pip install agent-log |
| agent-logger | Log management — routing, filtering, and formatting agent log streams | pip install agent-logger |
| Library | What it does | Install |
|---|---|---|
| agent-checkpoint | Save state mid-run, resume from the last good step — no restarting from scratch | pip install agent-checkpoint |
| agent-saga | Distributed rollback with compensating actions — handles partial failures gracefully | pip install agent-saga |
| agent-retry | Retry policies — exponential backoff, jitter, max-attempts, per-exception rules | pip install agent-retry |
| agent-fallback | Graceful degradation — defines fallback chains when primary agents fail | pip install agent-fallback |
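The retry policy agent-retry names — exponential backoff with jitter — produces a delay schedule like the following sketch (plain Python, not the library's actual API):

```python
import random

# Illustrative backoff schedule — exponential backoff + jitter,
# the policy agent-retry names. Not its actual API.
def backoff_delays(max_attempts: int, base_delay: float,
                   jitter: bool = True, seed: int = 0) -> list:
    """Delay before each retry: base * 2^attempt, plus optional random jitter."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(max_attempts):
        delay = base_delay * (2 ** attempt)
        if jitter:
            # Jitter de-synchronizes retry storms across concurrent callers
            delay += rng.uniform(0, base_delay)
        delays.append(delay)
    return delays

plain = backoff_delays(3, 1.0, jitter=False)  # [1.0, 2.0, 4.0]
```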
| Library | What it does | Install |
|---|---|---|
| agent-consensus | Weighted majority vote across multiple agents — resolves disagreements systematically | pip install agent-consensus |
| agent-coordinator | Multi-agent coordination — orchestrates agent roles, dependencies, and handoffs | pip install agent-coordinator |
| agent-aggregator | Result aggregation — merges outputs from parallel agents into a single result | pip install agent-aggregator |
| agent-reducer | Output reduction — folds multiple agent responses into a minimal representation | pip install agent-reducer |
| Library | What it does | Install |
|---|---|---|
| agent-planner | Task planning — decomposes goals into executable step sequences | pip install agent-planner |
| agent-workflow | Workflow orchestration — defines multi-step agent workflows with branching | pip install agent-workflow |
| agent-pipeline | Pipeline construction — chains agents into composable processing pipelines | pip install agent-pipeline |
| agent-flow | Control flow — conditional branching, loops, and parallel execution for agents | pip install agent-flow |
| agent-scheduler | Task scheduling — time-based and trigger-based agent execution | pip install agent-scheduler |
| agent-queue | Task queuing — priority queues with backpressure for agent workloads | pip install agent-queue |
| Library | What it does | Install |
|---|---|---|
| agent-budget | Token + cost budgets per run — stops agents when spend limits are hit | pip install agent-budget |
| agent-limiter | Token budget + cost budget + rate limiting in one — the three-in-one spend enforcer | pip install agent-limiter |
| agent-throttle | Request throttling — smooths bursts to protect downstream APIs | pip install agent-throttle |
| agent-ledger | Cost ledger — tracks token spend per agent, per session, per user | pip install agent-ledger |
| agent-audit | Spend auditing — immutable log of every cost event for compliance and debugging | pip install agent-audit |
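Combining a cost ledger with a per-run cap — roughly what agent-ledger plus agent-budget provide — can be sketched like this (plain Python, not their actual APIs):

```python
# Illustrative ledger with a hard spend cap — the combination
# agent-ledger + agent-budget describe, not their actual APIs.
class BudgetExceeded(Exception):
    pass

class CostLedger:
    def __init__(self, max_usd: float):
        self.max_usd = max_usd
        self.entries = []

    def total(self) -> float:
        return sum(cost for _, cost in self.entries)

    def record(self, agent: str, cost_usd: float):
        # Record first so the audit trail keeps the offending entry,
        # then raise if the running total breaches the cap.
        self.entries.append((agent, cost_usd))
        if self.total() > self.max_usd:
            raise BudgetExceeded(f"spend {self.total():.2f} exceeds cap {self.max_usd:.2f}")

ledger = CostLedger(max_usd=0.10)
ledger.record("router", 0.01)
ledger.record("planner", 0.04)
running_total = ledger.total()  # 0.05, still under the cap
try:
    ledger.record("summarizer", 0.20)  # blows the budget
    tripped = False
except BudgetExceeded:
    tripped = True
```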
| Library | What it does | Install |
|---|---|---|
| agent-events | Event bus — pub/sub for agent lifecycle and state events | pip install agent-events |
| agent-event-sourcing | Event sourcing — rebuilds agent state from an append-only event log | pip install agent-event-sourcing |
| agent-signal | Signal handling — async signals for cross-agent communication | pip install agent-signal |
| agent-hook | Lifecycle hooks — pre/post hooks for any agent execution step | pip install agent-hook |
| agent-notifier | Notifications — fires alerts on agent failures, thresholds, or completions | pip install agent-notifier |
| agent-pubsub | Publish-subscribe messaging — decoupled topic-based communication between agents | pip install agent-pubsub |
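The pub/sub pattern behind agent-events and agent-pubsub — handlers subscribe to topics, publishers never know who is listening — reduces to a dictionary of handler lists. A minimal in-process sketch (not either library's actual API):

```python
from collections import defaultdict

# Illustrative in-process event bus — the pattern agent-events and
# agent-pubsub describe, not their actual APIs.
class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict):
        # Deliver to every handler on the topic; topics with no
        # subscribers silently drop the event.
        for handler in self._subscribers[topic]:
            handler(payload)

bus = EventBus()
seen = []
bus.subscribe("agent.failed", seen.append)
bus.publish("agent.failed", {"agent": "planner", "error": "timeout"})
bus.publish("agent.completed", {"agent": "router"})  # no subscriber, dropped
```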
| Library | What it does | Install |
|---|---|---|
| agent-session | Session management — scopes state, memory, and cost to a single user interaction | pip install agent-session |
| agent-health | Health checks — liveness and readiness probes for agent services | pip install agent-health |
| agent-watchdog | Crash recovery — monitors agents and restarts on unexpected failures | pip install agent-watchdog |
| agent-timer | Time management — deadlines, timeouts, and elapsed-time tracking | pip install agent-timer |
| agent-lock | Distributed locking — prevents concurrent agents from corrupting shared state | pip install agent-lock |
| Library | What it does | Install |
|---|---|---|
| agent-mapper | Data mapping — transforms agent I/O between different schemas | pip install agent-mapper |
| agent-extractor | Extraction — pulls structured data out of unstructured LLM outputs | pip install agent-extractor |
| agent-digest | Summarization and hashing — condenses long outputs and fingerprints content | pip install agent-digest |
| agent-tokenizer | Token counting — accurate token estimation across models without calling the API | pip install agent-tokenizer |
| agent-metadata | Metadata management — attaches and propagates structured metadata through pipelines | pip install agent-metadata |
| agent-filter | Data filtering — applies inclusion/exclusion rules to agent inputs and outputs | pip install agent-filter |
| agent-sampler | Data sampling — draws representative subsets from large agent datasets | pip install agent-sampler |
| agent-compress | Context compression — shrinks large agent payloads while preserving semantic content | pip install agent-compress |
| Library | What it does | Install |
|---|---|---|
| agent-config | Configuration management — typed config loading with environment overrides | pip install agent-config |
| agent-secrets | Secret management — secure loading and rotation of API keys and credentials | pip install agent-secrets |
| agent-plugin | Plugin system — extensible registry for adding capabilities to agents | pip install agent-plugin |
| agent-discovery | Service discovery — locates agents and tools dynamically at runtime | pip install agent-discovery |
| agent-pool | Resource pooling — manages reusable agent instances to cut cold-start overhead | pip install agent-pool |
| agent-dependency | Dependency management — explicit dependency graphs between agent components | pip install agent-dependency |
| agent-tools | Tool registry — central registry for all tools available to an agent | pip install agent-tools |
| Library | What it does | Install |
|---|---|---|
| agent-stream | Streaming responses — handles SSE and chunked streaming from LLMs | pip install agent-stream |
| agent-command | Command pattern — encapsulates agent actions as undoable, inspectable commands | pip install agent-command |
| Library | What it does | Install |
|---|---|---|
| agent-mock | Testing mocks — drop-in mocks for LLM APIs, tools, and agent components | pip install agent-mock |
```python
# Just routing — nothing else
from herald import Router

router = Router()
router.add("gpt-4", description="complex reasoning, analysis")
router.add("gpt-3.5-turbo", description="simple queries, fast responses")
result = router.route("what is 2 + 2?")  # → gpt-3.5-turbo, 0 tokens used
```

```python
# Just memory — nothing else
from engram import Memory

mem = Memory()
mem.store("user said X")
context = mem.recall("what did user say?")
```

```python
# Just guards — nothing else
from sentinel import Guard

guard = Guard(max_loops=5, max_cost_usd=0.10)
guard.check(loop_count=3, cost_usd=0.04)  # ok
guard.check(loop_count=6, cost_usd=0.04)  # raises LoopLimitError
```

```python
# Just evaluation — nothing else
from verdict import Evaluator

score = Evaluator().evaluate(task="summarize this", response=llm_output)
print(score.task, score.reasoning, score.tool_use)
```

```python
# Just checkpointing — nothing else
from agent_checkpoint import Checkpoint

cp = Checkpoint("my-agent-run")
cp.save(step=3, state={"messages": [...], "cost": 0.04})
# crash happens
state = cp.resume()  # picks up from step 3
```

```python
# Just consensus — nothing else
from agent_consensus import Consensus

vote = Consensus(weights={"gpt-4": 0.6, "claude": 0.4})
result = vote.decide([gpt4_output, claude_output])
```

```python
# Just telemetry — nothing else
from agent_telemetry import Telemetry

t = Telemetry()
t.counter("tool_calls").inc()
t.gauge("token_usage").set(1500)
t.histogram("latency_ms").observe(340)
print(t.histogram("latency_ms").percentile(0.95))
```

There is no arsenal object to import. No base class to inherit. No plugin registry to configure. Just Python.
Install individually — use only what you need:

```bash
# The originals
pip install herald    # semantic routing, 0 tokens
pip install engram    # agent memory
pip install sentinel  # loop + cost guards
pip install verdict   # 3D evaluation

# The full production stack
pip install agent-checkpoint agent-saga agent-retry agent-fallback
pip install agent-consensus agent-coordinator agent-aggregator
pip install agent-telemetry agent-trace agent-tracer agent-observability
pip install agent-limiter agent-budget agent-throttle agent-ledger
pip install agent-sandbox agent-guard agent-guardrails agent-policy
pip install agent-planner agent-workflow agent-pipeline agent-queue
pip install agent-session agent-health agent-watchdog agent-lock
pip install agent-validator agent-schema agent-serializer agent-formatter
pip install agent-events agent-signal agent-hook agent-notifier
pip install agent-config agent-secrets agent-discovery agent-pool
pip install agent-stream agent-command agent-tokenizer agent-cache
```

Or install from source:
```bash
pip install git+https://github.com/darshjme/<library-name>.git
```

arsenal is not a framework.

- It does not provide an `Agent` base class you inherit from
- It does not have an opinionated execution loop
- It does not require you to use all of it — or any particular combination
- It is not a LangChain replacement. LangChain is an abstraction layer. arsenal is a toolbox.
- It is not LlamaIndex. It does not do RAG.
- It is not AutoGen. It does not define agent communication protocols.
arsenal is what you reach for when your framework starts failing you.
When your LangChain agent loops forever — sentinel.
When you have no idea why it's expensive — agent-ledger, agent-telemetry.
When it crashes mid-run and loses everything — agent-checkpoint.
When you can't trust its output — verdict, agent-validator.
Use your framework. Just don't let it be your only line of defence.
Most open-source tooling gets built for recognition. Stars, followers, a name to carry forward.
This isn't that.
These 100 libraries were built in the quiet — not because the builder needed to be seen, but because the tools needed to exist. For developers who will run this code without ever meeting the person who wrote it. For systems that will never send acknowledgment. For a future that doesn't yet know it was prepared for.
That is the only worthwhile reason to build anything: because the thing needs to exist, and you can make it exist.
The world doesn't always reward builders. It loses their names. It forgets who shipped the thing that saved the sprint. It takes tools and moves on without looking back. A builder who has understood this — and builds anyway — is building from a different place than ambition. Something quieter. Something that doesn't need applause to keep running.
Each library here is a single, complete act. Routing. Memory. Guarding. Checkpointing. Given unconditionally to whoever needs it, whatever they're building, whether they ever discover who wrote it or not.
That's not humility. It's a different kind of confidence — one that doesn't require your name on the outcome.
Arsenal is the reliability layer under A2A agents.
Google's Agent2Agent (A2A) Protocol defines how agents discover each other (Agent Cards), communicate (JSON-RPC 2.0), and stream results (SSE). What it doesn't define is what happens when the remote agent is down, slow, or rate-limiting you back. That's Arsenal's job.
Every A2A call — tasks/send, tasks/get, SSE stream subscriptions — is a network call that can fail. Arsenal gives you the five production primitives you need to make those calls bulletproof:
| Arsenal lib | Vedic name | What it does for A2A |
|---|---|---|
| agent-circuit-breaker | kavacha | Opens the circuit after N failed A2A calls; fails fast until the remote agent recovers |
| agent-retry | punarjanma | Exponential backoff + jitter on transient A2A errors (429, 503, network blips) |
| agent-tracer | anusarana | Traces every A2A tasks/send call end-to-end with span IDs for debugging |
| agent-limiter | maryada | Rate-limits your outbound A2A calls so you don't overwhelm downstream agents |
| agent-session | sanga | Persists A2A session context across multi-turn task exchanges, with TTL |
```python
import httpx
from agent_circuit_breaker import CircuitBreaker, CircuitOpenError, ProtectedCaller

# 1. Define the A2A call (raw JSON-RPC 2.0 over HTTP)
def send_a2a_task(agent_url: str, task_payload: dict) -> dict:
    """Send a task to a remote A2A agent endpoint."""
    response = httpx.post(
        f"{agent_url}/",
        json={
            "jsonrpc": "2.0",
            "method": "tasks/send",
            "params": task_payload,
            "id": task_payload.get("id", 1),
        },
        timeout=30.0,
    )
    response.raise_for_status()
    return response.json()

# 2. Protect it with kavacha — open after 3 failures, recover after 60 s
breaker = CircuitBreaker(
    name="a2a-research-agent",
    failure_threshold=3,
    recovery_timeout_seconds=60.0,
    success_threshold=2,
)
protected_send = ProtectedCaller(send_a2a_task, breaker)

# 3. Wrap with punarjanma for retry on transient errors
from agent_retry import RetryExecutor, RetryPolicy

retry_executor = RetryExecutor(
    RetryPolicy(max_attempts=3, base_delay=1.0, jitter=True)
)

# 4. Wrap with anusarana for end-to-end tracing
from agent_tracer import Tracer

tracer = Tracer(service="orchestrator-agent")

# 5. Full production-grade A2A call
RESEARCH_AGENT_URL = "https://research-agent.example.com/a2a"

def call_research_agent(query: str, session_id: str) -> dict:
    with tracer.span("a2a.tasks/send", tags={"agent": "research-agent"}) as span:
        try:
            payload = {
                "id": session_id,
                "message": {
                    "role": "user",
                    "parts": [{"type": "text", "text": query}],
                },
            }
            # retry wraps the circuit-breaker-protected call
            result = retry_executor.execute(
                lambda: protected_send(RESEARCH_AGENT_URL, payload)
            )
            span.set_tag("status", "ok")
            return result
        except CircuitOpenError as e:
            span.set_tag("status", "circuit_open")
            span.set_tag("retry_after", e.retry_after)
            raise  # surface to caller — don't hide outages
        except Exception as e:
            span.set_tag("status", "error")
            span.set_tag("error", str(e))
            raise

# Usage
result = call_research_agent(
    query="Summarize Q1 2025 earnings for NVDA",
    session_id="session-abc123",
)
print(result["result"]["status"])  # completed / working / failed
```

What this gives you:
- ✅ Circuit breaker: if `research-agent` is down, you fail fast after 3 attempts instead of hammering it
- ✅ Retry with jitter: transient 429s or network blips are handled automatically
- ✅ Distributed tracing: every A2A hop has a span ID — debug multi-agent pipelines in seconds
- ✅ Zero external dependencies: the entire stack is pure Python, ships in any container
A2A standardises how agents talk. Arsenal standardises how those conversations survive the real world.
```
┌─────────────────────────────────────────┐
│        Your Orchestrator Agent          │
│                                         │
│  ┌─────────┐ ┌──────────┐ ┌────────┐    │
│  │ kavacha │ │punarjanma│ │anusara.│    │
│  │ circuit │ │  retry   │ │ trace  │    │
│  │ breaker │ │ backoff  │ │  span  │    │
│  └────┬────┘ └────┬─────┘ └───┬────┘    │
│       └───────────┴───────────┘         │
│           A2A JSON-RPC 2.0              │
└─────────────────────────────────────────┘
          │                 │
   ┌──────▼─────┐    ┌──────▼──────┐
   │ Agent Card │    │ Agent Card  │
   │ tasks/send │    │ tasks/get   │
   │ SSE stream │    │ SSE stream  │
   └────────────┘    └─────────────┘
   Remote Agent A    Remote Agent B
```
The A2A protocol spec is 0 lines of reliability code. That's by design — protocol specs shouldn't enforce runtime behaviour. Arsenal fills that gap: 100 libraries, 4,375 tests, zero dependencies. Drop it into any A2A-based system in minutes.
```bash
pip install agent-circuit-breaker agent-retry agent-tracer agent-limiter agent-session

# or with Vedic names (same packages, new PyPI names in v2)
pip install kavacha punarjanma anusarana maryada sanga
```

Part of the Vedic Arsenal — 100 production-grade Python libraries for LLM agent reliability, each named from the Vedas, Puranas, and Mahakavyas.
कर्मण्येवाधिकारस्ते मा फलेषु कदाचन
You have a right to your work, not to the fruits of your work.
Build what needs building. Ship it correctly.
Most agent frameworks promise to handle everything — and then fail quietly in production when routing misfires, memory runs out, loops spin forever, and you have no idea why. arsenal is the opposite. 100 small, focused libraries, each doing one thing extremely well. They do not hide the LLM from you. They do not abstract away your control. They give you sharp instruments and get out of the way.
Your agents do the work. arsenal makes sure they do it without failure.
| Project | What it is |
|---|---|
| a2a-reliability-starter | Production-ready reliability layer for Google A2A Protocol agents — uses Arsenal patterns (kavacha, punarjanma, anusarana, maryada, sanga) |
| llm-reliability-starter | Reliability monitoring starter for any LLM pipeline — circuit breaker, evaluator, monitor |
Darshankumar Joshi — Gujarat, India.
Building production-grade LLM infrastructure, in silence, for everyone.
4,375 tests · 100 libraries · zero external dependencies · MIT licensed · production-tested
Use one. Use all. They compose.