arsenal

the complete production stack for LLM agents

tools built in silence, for everyone who builds AI

100 Libraries · 4375 Tests · Zero Dependencies · MIT License · Python 3.8+


There is a kind of building that does not require recognition.
That gives tools to strangers who will never know your name,
because the alternative — not building — is worse.


Why Agents Fail in Production

You ship an agent. It works in your notebook. Then reality happens:

Failure Mode What Goes Wrong
🔀 Routing costs Every query hits the expensive model — you pay GPT-4 prices for "what's the weather"
🧠 No memory Each turn starts from zero — your agent forgets what it said three messages ago
♾️ Infinite loops Tool calls chain forever, cost explodes, timeouts crash the whole pipeline
📊 No evaluation You have no idea if the agent is actually doing its job
💥 Bad output Responses fail schema validation, leak PII, or need 5 retries before they're usable
👁️ Blind runs Production breaks and you have zero traces, zero spans, zero idea what happened
💸 No budgets Token costs spiral — no per-run caps, no cost ledger, no circuit breakers
🔄 No recovery A crash mid-run loses all state — no checkpoint, no resume, no rollback
🗳️ Single agent bias One model, one opinion — no consensus, no majority vote, no cross-check
🌊 No flow control Queues overflow, consumers block, nothing applies backpressure, nothing throttles

arsenal is the fix. 100 surgically targeted libraries. Zero lock-in. Zero external dependencies.


The Production Lifecycle

Route → Budget → Guard → Remember → Compress → Observe → Evaluate → Validate
→ Retry → State → Schema → Pipeline → Cache → Config → Events → Health
→ RateLimit → CircuitBreak → Stream → Benchmark → Router → Notifier → Pool
→ Discovery → Checkpoint → Queue → Planner → Limiter → Sandbox → Session
→ Audit → Consensus → Fallback → Throttle → Workflow → Trace → Saga
→ Signal → Lock → Telemetry → Serializer → Hook → Filter → Command

Each step in this lifecycle is a library. Each library does one thing. None of them know the others exist.
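Because the libraries are independent, composing them is ordinary function wrapping. Here is a pure-Python sketch of the idea; the two decorators below are illustrative stand-ins, not arsenal APIs:

```python
import functools

def with_retry(max_attempts):
    """Retry layer: re-invoke the wrapped function on any exception."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: surface the error
        return wrapper
    return deco

def with_trace(log):
    """Trace layer: record each call without touching the function."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            log.append(fn.__name__)
            return fn(*args, **kwargs)
        return wrapper
    return deco

log = []
calls = {"n": 0}

@with_trace(log)   # outermost layer: observe
@with_retry(3)     # inner layer: recover
def flaky_step():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient")
    return "ok"

print(flaky_step(), log, calls["n"])  # ok ['flaky_step'] 3
```

Neither layer knows the other exists; swap either one out and the rest keeps working. That is the whole composition model.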


Libraries

🔀 Routing & Dispatch

Library What it does Install
herald Semantic routing without a single token — directs queries to the right model using embeddings pip install herald
agent-router Runtime dispatch — routes agent actions to the correct handler pip install agent-router
agent-dispatcher Fan-out routing — broadcasts tasks to multiple downstream agents pip install agent-dispatcher
agent-classifier Input classification — labels incoming queries before routing pip install agent-classifier
agent-selector Candidate selection — picks the best model, tool, or handler from a pool pip install agent-selector
agent-balancer Load balancing for LLM endpoints — RoundRobin, WeightedRandom, LeastConnections with health tracking and automatic failover pip install agent-balancer

🧠 Memory & Context

Library What it does Install
engram Short-term + episodic memory — context windows that actually persist across turns pip install engram
agent-context Context management — builds, trims, and injects context into prompts pip install agent-context
agent-context-window Window sizing — manages token limits and context eviction strategies pip install agent-context-window
agent-state Agent state machine — models agent lifecycle as explicit state transitions pip install agent-state
agent-store Key-value state storage — persistent agent memory with TTL and namespacing pip install agent-store
agent-kv In-process key-value store — fast ephemeral state for single-run agents pip install agent-kv
agent-cache Response caching — caches LLM outputs by prompt hash to cut costs pip install agent-cache
agent-snapshot State snapshots — point-in-time captures of full agent state pip install agent-snapshot

🛡️ Guarding & Safety

Library What it does Install
sentinel Loop detection, cost caps, timeout guards — stops runaway agents before they bankrupt you pip install sentinel
agent-guard Generic guardrails — pre/post-execution safety checks for any agent action pip install agent-guard
agent-guardrails Output safety — schema enforcement, PII redaction, retry logic pip install agent-guardrails
agent-sandbox Restricted code execution — runs agent-generated code in a confined namespace pip install agent-sandbox
agent-rate-limiter Rate limiting — token-bucket and sliding-window rate limits per agent or API key pip install agent-rate-limiter
agent-circuit-breaker Circuit breaking — opens on repeated failures, recovers gracefully pip install agent-circuit-breaker
agent-policy Policy enforcement — declarative rules for what agents can and cannot do pip install agent-policy
agent-semaphore Concurrency limits — caps parallel agent executions to prevent overload pip install agent-semaphore
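For a sense of what the rate-limiting pattern looks like under the hood, here is a minimal token-bucket sketch in pure Python. It illustrates the algorithm the agent-rate-limiter row names, not the library's actual API:

```python
import time

class TokenBucket:
    """Token bucket: allows bursts up to `capacity`, refills at `rate`/sec."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate                # tokens refilled per second
        self.capacity = capacity       # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2.0, capacity=5)
results = [bucket.allow() for _ in range(7)]
print(results)  # the first 5 pass (burst), the rest are refused until refill
```

The same shape underlies most per-key rate limiters: one bucket per agent or API key, refilled on read rather than by a background timer.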

📊 Evaluation & Benchmarking

Library What it does Install
verdict 3D evaluation — task completion, reasoning quality, tool-use correctness in one call pip install verdict
agent-scorer Scoring pipelines — pluggable scoring criteria for any agent output pip install agent-scorer
agent-benchmark Performance benchmarking — latency, throughput, and accuracy tracking pip install agent-benchmark
agent-specs Specification testing — test agents against behavioral specs, not just outputs pip install agent-specs
agent-profiler Profiling — identifies bottlenecks in agent execution paths pip install agent-profiler

✅ Validation & Schema

Library What it does Install
agent-validator Output validation — validates LLM responses against typed schemas pip install agent-validator
agent-schema Schema enforcement — defines and enforces structured output contracts pip install agent-schema
agent-formatter Output formatting — normalizes agent responses to consistent formats pip install agent-formatter
agent-template Prompt templates — typed, composable templates with variable injection pip install agent-template
agent-serializer Serialization — converts agent state and outputs to/from wire formats pip install agent-serializer
agent-converter Type conversion — transforms data between agent-compatible formats pip install agent-converter

👁️ Observability & Tracing

Library What it does Install
agent-observability Full-pipeline tracing — spans, latency, token cost, JSONL export for every run pip install agent-observability
agent-observer Event observation — subscribes to agent lifecycle events without modifying them pip install agent-observer
agent-trace Distributed tracing — propagates trace context across agent boundaries pip install agent-trace
agent-tracer Span tracking — creates, manages, and exports spans in any tracing format pip install agent-tracer
agent-telemetry Counter/Gauge/Histogram with percentile tracking — production metrics for agents pip install agent-telemetry
agent-log Structured logging — JSON-first logging with trace correlation pip install agent-log
agent-logger Log management — routing, filtering, and formatting agent log streams pip install agent-logger

🔁 Checkpointing & Recovery

Library What it does Install
agent-checkpoint Save state mid-run, resume from the last good step — no restarting from scratch pip install agent-checkpoint
agent-saga Distributed rollback with compensating actions — handles partial failures gracefully pip install agent-saga
agent-retry Retry policies — exponential backoff, jitter, max-attempts, per-exception rules pip install agent-retry
agent-fallback Graceful degradation — defines fallback chains when primary agents fail pip install agent-fallback
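The retry strategy named in the agent-retry row, exponential backoff with jitter, fits in a few lines of stdlib Python. The function below is illustrative only: it computes the delay schedule rather than calling the library's real API:

```python
import random

def backoff_delays(max_attempts, base=1.0, cap=30.0, seed=None):
    """Return a list of sleep delays: exponential ceiling with full jitter."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))  # 1, 2, 4, 8, ... capped
        delays.append(rng.uniform(0, ceiling))     # "full jitter" variant
    return delays

print(backoff_delays(4, seed=42))
```

Full jitter (drawing uniformly from zero up to the exponential ceiling) spreads simultaneous retries apart, which matters when many agents hit the same rate-limited endpoint at once.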

🗳️ Consensus & Coordination

Library What it does Install
agent-consensus Weighted majority vote across multiple agents — resolves disagreements systematically pip install agent-consensus
agent-coordinator Multi-agent coordination — orchestrates agent roles, dependencies, and handoffs pip install agent-coordinator
agent-aggregator Result aggregation — merges outputs from parallel agents into a single result pip install agent-aggregator
agent-reducer Output reduction — folds multiple agent responses into a minimal representation pip install agent-reducer
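Weighted majority voting, the core of the consensus pattern above, is simple to illustrate. A stdlib sketch, not agent-consensus's actual interface:

```python
from collections import defaultdict

def weighted_vote(answers, weights):
    """answers: {agent_name: answer}; weights: {agent_name: weight}.
    Returns the answer with the highest combined weight."""
    tally = defaultdict(float)
    for agent, answer in answers.items():
        tally[answer] += weights.get(agent, 1.0)  # unweighted agents count as 1
    return max(tally, key=tally.get)

result = weighted_vote(
    {"gpt-4": "A", "claude": "B", "mistral": "B"},
    {"gpt-4": 0.6, "claude": 0.4, "mistral": 0.3},
)
print(result)  # B — two lower-weight agents (0.7 combined) outvote one at 0.6
```

Note that the heaviest single voter does not automatically win; consensus resolves exactly the disagreements a single-agent setup never surfaces.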

📋 Planning & Workflow

Library What it does Install
agent-planner Task planning — decomposes goals into executable step sequences pip install agent-planner
agent-workflow Workflow orchestration — defines multi-step agent workflows with branching pip install agent-workflow
agent-pipeline Pipeline construction — chains agents into composable processing pipelines pip install agent-pipeline
agent-flow Control flow — conditional branching, loops, and parallel execution for agents pip install agent-flow
agent-scheduler Task scheduling — time-based and trigger-based agent execution pip install agent-scheduler
agent-queue Task queuing — priority queues with backpressure for agent workloads pip install agent-queue

💸 Budgeting & Throttling

Library What it does Install
agent-budget Token + cost budgets per run — stops agents when spend limits are hit pip install agent-budget
agent-limiter Token budget + cost budget + rate limiting in one — the three-in-one spend enforcer pip install agent-limiter
agent-throttle Request throttling — smooths bursts to protect downstream APIs pip install agent-throttle
agent-ledger Cost ledger — tracks token spend per agent, per session, per user pip install agent-ledger
agent-audit Spend auditing — immutable log of every cost event for compliance and debugging pip install agent-audit
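The budget-cap pattern boils down to one rule: refuse a charge before it crosses the limit, never after. A minimal pure-Python sketch; the class and exception names are hypothetical, not agent-budget's API:

```python
class BudgetExceeded(RuntimeError):
    pass

class Budget:
    """Per-run spend cap: rejects any charge that would cross the limit."""
    def __init__(self, max_usd: float):
        self.max_usd = max_usd
        self.spent = 0.0

    def charge(self, usd: float):
        # check BEFORE spending, so the cap is never exceeded
        if self.spent + usd > self.max_usd:
            raise BudgetExceeded(f"would reach {self.spent + usd:.4f}, cap is {self.max_usd:.4f}")
        self.spent += usd

budget = Budget(max_usd=0.10)
budget.charge(0.04)
budget.charge(0.04)
try:
    budget.charge(0.04)  # would total 0.12, over the 0.10 cap: blocked
except BudgetExceeded as e:
    print("stopped:", e)
```

The pre-check is the important design choice: a post-hoc check lets the overspending call through and only complains afterwards.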

📡 Events & Signals

Library What it does Install
agent-events Event bus — pub/sub for agent lifecycle and state events pip install agent-events
agent-event-sourcing Event sourcing — rebuilds agent state from an append-only event log pip install agent-event-sourcing
agent-signal Signal handling — async signals for cross-agent communication pip install agent-signal
agent-hook Lifecycle hooks — pre/post hooks for any agent execution step pip install agent-hook
agent-notifier Notifications — fires alerts on agent failures, thresholds, or completions pip install agent-notifier
agent-pubsub Publish-subscribe messaging — decoupled topic-based communication between agents pip install agent-pubsub

🏥 Session & Lifecycle

Library What it does Install
agent-session Session management — scopes state, memory, and cost to a single user interaction pip install agent-session
agent-health Health checks — liveness and readiness probes for agent services pip install agent-health
agent-watchdog Crash recovery — monitors agents and restarts on unexpected failures pip install agent-watchdog
agent-timer Time management — deadlines, timeouts, and elapsed-time tracking pip install agent-timer
agent-lock Distributed locking — prevents concurrent agents from corrupting shared state pip install agent-lock

🔄 Data & Transformation

Library What it does Install
agent-mapper Data mapping — transforms agent I/O between different schemas pip install agent-mapper
agent-extractor Extraction — pulls structured data out of unstructured LLM outputs pip install agent-extractor
agent-digest Summarization and hashing — condenses long outputs and fingerprints content pip install agent-digest
agent-tokenizer Token counting — accurate token estimation across models without calling the API pip install agent-tokenizer
agent-metadata Metadata management — attaches and propagates structured metadata through pipelines pip install agent-metadata
agent-filter Data filtering — applies inclusion/exclusion rules to agent inputs and outputs pip install agent-filter
agent-sampler Data sampling — draws representative subsets from large agent datasets pip install agent-sampler
agent-compress Context compression — shrinks large agent payloads while preserving semantic content pip install agent-compress
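Offline token counting is harder than it looks; a common rough heuristic is about four characters per English token. The sketch below shows only that rule of thumb, emphatically not agent-tokenizer's model-specific algorithm:

```python
def estimate_tokens(text: str) -> int:
    """Crude offline estimate: ~4 characters per token for English text.
    Real tokenizers are model-specific; use this only for ballpark sizing."""
    return max(1, len(text) // 4)

print(estimate_tokens("The quick brown fox jumps over the lazy dog."))  # 11
```

Even a ballpark like this is enough for coarse context-window eviction decisions; anything billing-related needs the model's real tokenizer.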

⚙️ Configuration & Discovery

Library What it does Install
agent-config Configuration management — typed config loading with environment overrides pip install agent-config
agent-secrets Secret management — secure loading and rotation of API keys and credentials pip install agent-secrets
agent-plugin Plugin system — extensible registry for adding capabilities to agents pip install agent-plugin
agent-discovery Service discovery — locates agents and tools dynamically at runtime pip install agent-discovery
agent-pool Resource pooling — manages reusable agent instances to cut cold-start overhead pip install agent-pool
agent-dependency Dependency management — explicit dependency graphs between agent components pip install agent-dependency
agent-tools Tool registry — central registry for all tools available to an agent pip install agent-tools

🌊 Streaming & Commands

Library What it does Install
agent-stream Streaming responses — handles SSE and chunked streaming from LLMs pip install agent-stream
agent-command Command pattern — encapsulates agent actions as undoable, inspectable commands pip install agent-command

🧪 Testing & Mocking

Library What it does Install
agent-mock Testing mocks — drop-in mocks for LLM APIs, tools, and agent components pip install agent-mock

Quick Start

# Just routing — nothing else
from herald import Router
router = Router()
router.add("gpt-4", description="complex reasoning, analysis")
router.add("gpt-3.5-turbo", description="simple queries, fast responses")
result = router.route("what is 2 + 2?")  # → gpt-3.5-turbo, 0 tokens used

# Just memory — nothing else
from engram import Memory
mem = Memory()
mem.store("user said X")
context = mem.recall("what did user say?")

# Just guards — nothing else
from sentinel import Guard
guard = Guard(max_loops=5, max_cost_usd=0.10)
guard.check(loop_count=3, cost_usd=0.04)  # ok
guard.check(loop_count=6, cost_usd=0.04)  # raises LoopLimitError

# Just evaluation — nothing else
from verdict import Evaluator
score = Evaluator().evaluate(task="summarize this", response=llm_output)
print(score.task, score.reasoning, score.tool_use)

# Just checkpointing — nothing else
from agent_checkpoint import Checkpoint
cp = Checkpoint("my-agent-run")
cp.save(step=3, state={"messages": [...], "cost": 0.04})
# crash happens
state = cp.resume()  # picks up from step 3

# Just consensus — nothing else
from agent_consensus import Consensus
vote = Consensus(weights={"gpt-4": 0.6, "claude": 0.4})
result = vote.decide([gpt4_output, claude_output])

# Just telemetry — nothing else
from agent_telemetry import Telemetry
t = Telemetry()
t.counter("tool_calls").inc()
t.gauge("token_usage").set(1500)
t.histogram("latency_ms").observe(340)
print(t.histogram("latency_ms").percentile(0.95))

There is no arsenal object to import. No base class to inherit. No plugin registry to configure. Just Python.


Install

Install individually — use only what you need:

# The originals
pip install herald           # semantic routing, 0 tokens
pip install engram           # agent memory
pip install sentinel         # loop + cost guards
pip install verdict          # 3D evaluation

# The full production stack
pip install agent-checkpoint agent-saga agent-retry agent-fallback
pip install agent-consensus agent-coordinator agent-aggregator
pip install agent-telemetry agent-trace agent-tracer agent-observability
pip install agent-limiter agent-budget agent-throttle agent-ledger
pip install agent-sandbox agent-guard agent-guardrails agent-policy
pip install agent-planner agent-workflow agent-pipeline agent-queue
pip install agent-session agent-health agent-watchdog agent-lock
pip install agent-validator agent-schema agent-serializer agent-formatter
pip install agent-events agent-signal agent-hook agent-notifier
pip install agent-config agent-secrets agent-discovery agent-pool
pip install agent-stream agent-command agent-tokenizer agent-cache

Or install from source:

pip install git+https://github.com/darshjme/<library-name>.git

What This Isn't

arsenal is not a framework.

  • It does not provide an Agent base class you inherit from
  • It does not have an opinionated execution loop
  • It does not require you to use all of it — or any particular combination
  • It is not a LangChain replacement. LangChain is an abstraction layer. arsenal is a toolbox.
  • It is not LlamaIndex. It does not do RAG.
  • It is not AutoGen. It does not define agent communication protocols.

arsenal is what you reach for when your framework starts failing you.

When your LangChain agent loops forever — sentinel.
When you have no idea why it's expensive — agent-ledger, agent-telemetry.
When it crashes mid-run and loses everything — agent-checkpoint.
When you can't trust its output — verdict, agent-validator.

Use your framework. Just don't let it be your only line of defence.


Why This Exists

Most open-source tooling gets built for recognition. Stars, followers, a name to carry forward.

This isn't that.

These 100 libraries were built in the quiet — not because the builder needed to be seen, but because the tools needed to exist. For developers who will run this code without ever meeting the person who wrote it. For systems that will never send acknowledgment. For a future that doesn't yet know it was prepared for.

That is the only worthwhile reason to build anything: because the thing needs to exist, and you can make it exist.

The world doesn't always reward builders. It loses their names. It forgets who shipped the thing that saved the sprint. It takes tools and moves on without looking back. A builder who has understood this — and builds anyway — is building from a different place than ambition. Something quieter. Something that doesn't need applause to keep running.

Each library here is a single, complete act. Routing. Memory. Guarding. Checkpointing. Given unconditionally to whoever needs it, whatever they're building, whether they ever discover who wrote it or not.

That's not humility. It's a different kind of confidence — one that doesn't require your name on the outcome.


🔗 A2A Protocol Ready

Arsenal is the reliability layer under A2A agents.

Google's Agent2Agent (A2A) Protocol defines how agents discover each other (Agent Cards), communicate (JSON-RPC 2.0), and stream results (SSE). What it doesn't define is what happens when the remote agent is down, slow, or rate-limiting you back. That's Arsenal's job.

Every A2A call — tasks/send, tasks/get, SSE stream subscriptions — is a network call that can fail. Arsenal gives you the five production primitives you need to make those calls bulletproof:

Arsenal lib Vedic name What it does for A2A
agent-circuit-breaker kavacha Open the circuit after N failed A2A calls; fail fast until the remote agent recovers
agent-retry punarjanma Exponential backoff + jitter on transient A2A errors (429, 503, network blips)
agent-tracer anusarana Trace every A2A tasks/send call end-to-end with span IDs for debugging
agent-limiter maryada Rate-limit your outbound A2A calls so you don't overwhelm downstream agents
agent-session sanga Persist A2A session context across multi-turn task exchanges with TTL

Wrapping an A2A Call with kavacha (circuit-breaker)

import httpx
from agent_circuit_breaker import CircuitBreaker, CircuitOpenError, ProtectedCaller

# 1. Define the A2A call (raw JSON-RPC 2.0 over HTTP)
def send_a2a_task(agent_url: str, task_payload: dict) -> dict:
    """Send a task to a remote A2A agent endpoint."""
    response = httpx.post(
        f"{agent_url}/",
        json={
            "jsonrpc": "2.0",
            "method": "tasks/send",
            "params": task_payload,
            "id": task_payload.get("id", 1),
        },
        timeout=30.0,
    )
    response.raise_for_status()
    return response.json()

# 2. Protect it with kavacha — open after 3 failures, recover after 60 s
breaker = CircuitBreaker(
    name="a2a-research-agent",
    failure_threshold=3,
    recovery_timeout_seconds=60.0,
    success_threshold=2,
)
protected_send = ProtectedCaller(send_a2a_task, breaker)

# 3. Wrap with punarjanma for retry on transient errors
from agent_retry import RetryExecutor, RetryPolicy

retry_executor = RetryExecutor(
    RetryPolicy(max_attempts=3, base_delay=1.0, jitter=True)
)

# 4. Wrap with anusarana for end-to-end tracing
from agent_tracer import Tracer

tracer = Tracer(service="orchestrator-agent")

# 5. Full production-grade A2A call
RESEARCH_AGENT_URL = "https://research-agent.example.com/a2a"

def call_research_agent(query: str, session_id: str) -> dict:
    with tracer.span("a2a.tasks/send", tags={"agent": "research-agent"}) as span:
        try:
            payload = {
                "id": session_id,
                "message": {
                    "role": "user",
                    "parts": [{"type": "text", "text": query}],
                },
            }
            # circuit-breaker wraps the retry-wrapped call
            result = retry_executor.execute(
                lambda: protected_send(RESEARCH_AGENT_URL, payload)
            )
            span.set_tag("status", "ok")
            return result
        except CircuitOpenError as e:
            span.set_tag("status", "circuit_open")
            span.set_tag("retry_after", e.retry_after)
            raise  # surface to caller — don't hide outages
        except Exception as e:
            span.set_tag("status", "error")
            span.set_tag("error", str(e))
            raise

# Usage
result = call_research_agent(
    query="Summarize Q1 2025 earnings for NVDA",
    session_id="session-abc123",
)
print(result["result"]["status"])  # completed / working / failed

What this gives you:

  • Circuit breaker: if research-agent is down, you fail fast after 3 attempts instead of hammering it
  • Retry with jitter: transient 429s or network blips are handled automatically
  • Distributed tracing: every A2A hop has a span ID — debug multi-agent pipelines in seconds
  • Zero external dependencies: the entire stack is pure Python, ships in any container

Why Arsenal is the Reliability Layer Under A2A Agents

A2A standardises how agents talk. Arsenal standardises how those conversations survive the real world.

┌─────────────────────────────────────────┐
│         Your Orchestrator Agent         │
│                                         │
│  ┌─────────┐  ┌──────────┐  ┌────────┐ │
│  │ kavacha │  │punarjanma│  │anusara.│ │
│  │circuit  │  │  retry   │  │ trace  │ │
│  │breaker  │  │ backoff  │  │  span  │ │
│  └────┬────┘  └────┬─────┘  └───┬────┘ │
│       └────────────┴────────────┘       │
│              A2A JSON-RPC 2.0           │
└─────────────────────────────────────────┘
            │                │
  ┌─────────▼──┐      ┌──────▼──────┐
  │ Agent Card │      │ Agent Card  │
  │ tasks/send │      │ tasks/get   │
  │ SSE stream │      │ SSE stream  │
  └────────────┘      └─────────────┘
  Remote Agent A      Remote Agent B

The A2A protocol spec contains zero lines of reliability code. That's by design — protocol specs shouldn't prescribe runtime behaviour. Arsenal fills that gap: 100 libraries, 4,375 tests, zero dependencies. Drop it into any A2A-based system in minutes.


Quick install

pip install agent-circuit-breaker agent-retry agent-tracer agent-limiter agent-session
# or with Vedic names (same packages, new PyPI names in v2)
pip install kavacha punarjanma anusarana maryada sanga

Part of the Vedic Arsenal — 100 production-grade Python libraries for LLM agent reliability, each named from the Vedas, Puranas, and Mahakavyas.


Philosophy

कर्मण्येवाधिकारस्ते मा फलेषु कदाचन
You have a right to your work, not to the fruits of your work.

Build what needs building. Ship it correctly.

Most agent frameworks promise to handle everything — and then fail quietly in production when routing misfires, memory runs out, loops spin forever, and you have no idea why. arsenal is the opposite. 100 small, focused libraries, each doing one thing extremely well. They do not hide the LLM from you. They do not abstract away your control. They give you sharp instruments and get out of the way.

Your agents do the work. arsenal makes sure the work survives failure.


Related Projects

Project What it is
a2a-reliability-starter Production-ready reliability layer for Google A2A Protocol agents — uses Arsenal patterns (kavacha, punarjanma, anusarana, maryada, sanga)
llm-reliability-starter Reliability monitoring starter for any LLM pipeline — circuit breaker, evaluator, monitor

Built by

Darshankumar Joshi — Gujarat, India.

Building production-grade LLM infrastructure, in silence, for everyone.


4375 tests · 100 libraries · zero external dependencies · MIT licensed · production-tested

Use one. Use all. They compose.
