# Proposal: Governance Hooks for Claude Agent SDK

## Summary
Add optional governance hooks to the Claude Agent SDK that enable policy enforcement, threat detection, and audit trails for tool calls — similar to what guardrails do for model outputs, but at the tool execution layer.
## Problem
When building multi-agent systems with the Claude Agent SDK, there is no built-in mechanism to:
- Enforce tool-level policies (which tools can be called, with what arguments, how often)
- Detect threat patterns in tool arguments before execution (data exfiltration, privilege escalation)
- Score trust levels between agents for safe delegation
- Generate immutable audit trails of all tool calls and policy decisions
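Absent SDK support, each team hand-rolls an ad-hoc wrapper around its tool dispatcher. A minimal sketch of what that looks like today (`guarded_call`, `BLOCKED_TOOLS`, and the log format are all hypothetical, not SDK API):

```python
# Ad-hoc tool-call guard that each team currently reimplements by hand.
# All names here are hypothetical; nothing comes from the SDK itself.
BLOCKED_TOOLS = {"execute_shell", "delete_file"}
MAX_TOOL_CALLS = 50

def guarded_call(tool_name: str, dispatch, audit_log: list):
    """Run a tool through a hand-rolled policy check, recording each decision."""
    if tool_name in BLOCKED_TOOLS:
        audit_log.append({"tool": tool_name, "decision": "blocked"})
        raise PermissionError(f"{tool_name} is blocked by policy")
    allowed_so_far = sum(1 for e in audit_log if e["decision"] == "allowed")
    if allowed_so_far >= MAX_TOOL_CALLS:
        audit_log.append({"tool": tool_name, "decision": "rate_limited"})
        raise PermissionError("tool-call budget exhausted")
    audit_log.append({"tool": tool_name, "decision": "allowed"})
    return dispatch(tool_name)
```

Because this wrapper lives outside the agent loop, it sees only the tool name and arguments, none of the surrounding agent context — which is the gap the sections below address.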
## Proposed Design
```python
from claude_agent_sdk import Agent, GovernancePolicy

# Define governance policy
policy = GovernancePolicy(
    name="production-safe",
    allowed_tools=["search", "read_file", "write_file"],
    blocked_tools=["execute_shell", "delete_file"],
    max_tool_calls=50,
    content_filters=["no_pii", "no_secrets"],
    threat_detection=True,
    audit_trail=True,
)

# Apply to agent
agent = Agent(
    name="research-agent",
    tools=[search, read_file, write_file],
    governance=policy,  # Governance hooks intercept tool calls
)
```
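The intended semantics: `blocked_tools` takes precedence over `allowed_tools`, and `max_tool_calls` caps the per-run budget. A standalone sketch of that precedence logic (field names follow the proposal above; the function and dict representation are illustrative, not SDK API):

```python
def is_allowed(policy: dict, tool: str, calls_so_far: int) -> bool:
    """Evaluate one tool call against the proposed policy fields.
    Deny-list wins over allow-list; the call budget is checked last."""
    if tool in policy["blocked_tools"]:
        return False
    if policy["allowed_tools"] and tool not in policy["allowed_tools"]:
        return False
    return calls_so_far < policy["max_tool_calls"]

# Plain-dict stand-in for the "production-safe" policy above
production_safe = {
    "allowed_tools": ["search", "read_file", "write_file"],
    "blocked_tools": ["execute_shell", "delete_file"],
    "max_tool_calls": 50,
}
```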
## Hook Points
- `before_tool_call` — Validate the tool name and arguments against policy. Block if disallowed.
- `after_tool_call` — Audit the result; check for sensitive-data leakage.
- `on_delegation` — When Agent A delegates to Agent B, verify trust-score thresholds.
- `on_policy_violation` — Callback when a policy rule is triggered (for alerting/monitoring).
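One way the four hook points could surface in Python — a sketch only; `GovernanceHooks`, `ToolCall`, the return conventions, and the 0.5 trust threshold are all assumptions, not proposed signatures:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ToolCall:
    tool: str
    args: dict

class GovernanceHooks:
    """Hypothetical base class exposing the four hook points."""

    def before_tool_call(self, call: ToolCall) -> bool:
        """Return False to block the call before it executes."""
        return True

    def after_tool_call(self, call: ToolCall, result: Any) -> Any:
        """Inspect or redact the result before it reaches the agent."""
        return result

    def on_delegation(self, delegator: str, delegate: str, trust: float) -> bool:
        """Gate delegation on a trust-score threshold (0.5 chosen arbitrarily)."""
        return trust >= 0.5

    def on_policy_violation(self, call: ToolCall, rule: str) -> None:
        """Alerting/monitoring callback; default is a no-op."""

class DenyShell(GovernanceHooks):
    """Example subclass: block one tool and report the violation."""

    def before_tool_call(self, call: ToolCall) -> bool:
        allowed = call.tool != "execute_shell"
        if not allowed:
            self.on_policy_violation(call, "blocked_tools")
        return allowed
```

Subclassing keeps simple policies declarative while letting complex ones (threat scoring, per-tenant rules) override only the hooks they need.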
## Why Not External Middleware?
External governance works but loses context:
- SDK-level hooks have access to the full agent context (conversation history, tool results, delegation chain)
- Hook execution order can be guaranteed (governance before business logic)
- Audit trails can capture the complete agent decision trace, not just tool I/O
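On the audit-trail point: with SDK-level hooks, each entry can record the delegation chain alongside tool I/O, and chaining entries by hash makes after-the-fact tampering detectable. A sketch of that scheme (entry format and field names are assumptions):

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel hash for the first entry

def append_entry(trail: list, event: dict) -> None:
    """Append a hash-chained audit entry; editing history breaks the chain."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    trail.append({"hash": entry_hash, "prev": prev_hash, "event": event})

def verify(trail: list) -> bool:
    """Recompute every hash; any mutation of an earlier entry fails the check."""
    prev_hash = GENESIS
    for entry in trail:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

An external proxy could hash-chain too, but only in-process hooks can populate each event with the conversation and delegation context that makes the trail useful.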
## Prior Art
- OpenAI Agents SDK has `guardrails` (input/output guardrails on model responses)
- Google ADK has `BasePlugin` with `before_tool_callback` / `after_tool_callback`
- PydanticAI has middleware proposals (#2885)
- We've built governance integrations for all of these: agentmesh-integrations
## Context
We maintain Agent-OS and have filed similar governance proposals across the agent ecosystem.