# Proposal: Governance Hooks for Claude Agent SDK

## Summary
Add optional governance hooks to the Claude Agent SDK that enable policy enforcement, threat detection, and audit trails for tool calls — similar to what guardrails do for model outputs, but at the tool execution layer.
## Problem
When building multi-agent systems with the Claude Agent SDK, there is no built-in mechanism to:
- Enforce tool-level policies (which tools can be called, with what arguments, how often)
- Detect threat patterns in tool arguments before execution (data exfiltration, privilege escalation)
- Score trust levels between agents for safe delegation
- Generate immutable audit trails of all tool calls and policy decisions
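Absent SDK support, each team hand-rolls an ad-hoc wrapper around its tool dispatcher. A minimal sketch of what that looks like today (`guarded_call`, `BLOCKED_TOOLS`, and the log format are all hypothetical, not SDK API):

```python
# Ad-hoc tool-call guard that each team currently reimplements by hand.
# All names here are hypothetical; nothing comes from the SDK itself.
BLOCKED_TOOLS = {"execute_shell", "delete_file"}
MAX_TOOL_CALLS = 50

def guarded_call(tool_name: str, dispatch, audit_log: list):
    """Run a tool through a hand-rolled policy check, recording each decision."""
    if tool_name in BLOCKED_TOOLS:
        audit_log.append({"tool": tool_name, "decision": "blocked"})
        raise PermissionError(f"{tool_name} is blocked by policy")
    allowed_so_far = sum(1 for e in audit_log if e["decision"] == "allowed")
    if allowed_so_far >= MAX_TOOL_CALLS:
        audit_log.append({"tool": tool_name, "decision": "rate_limited"})
        raise PermissionError("tool-call budget exhausted")
    audit_log.append({"tool": tool_name, "decision": "allowed"})
    return dispatch(tool_name)
```

Because this wrapper lives outside the agent loop, it sees only the tool name and arguments, none of the surrounding agent context — which is the gap the sections below address.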
## Proposed Design
```python
from claude_agent_sdk import Agent, GovernancePolicy

# Define governance policy
policy = GovernancePolicy(
    name="production-safe",
    allowed_tools=["search", "read_file", "write_file"],
    blocked_tools=["execute_shell", "delete_file"],
    max_tool_calls=50,
    content_filters=["no_pii", "no_secrets"],
    threat_detection=True,
    audit_trail=True,
)

# Apply to agent
agent = Agent(
    name="research-agent",
    tools=[search, read_file, write_file],
    governance=policy,  # Governance hooks intercept tool calls
)
```
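The intended semantics: `blocked_tools` takes precedence over `allowed_tools`, and `max_tool_calls` caps the per-run budget. A standalone sketch of that precedence logic (field names follow the proposal above; the function and dict representation are illustrative, not SDK API):

```python
def is_allowed(policy: dict, tool: str, calls_so_far: int) -> bool:
    """Evaluate one tool call against the proposed policy fields.
    Deny-list wins over allow-list; the call budget is checked last."""
    if tool in policy["blocked_tools"]:
        return False
    if policy["allowed_tools"] and tool not in policy["allowed_tools"]:
        return False
    return calls_so_far < policy["max_tool_calls"]

# Plain-dict stand-in for the "production-safe" policy above
production_safe = {
    "allowed_tools": ["search", "read_file", "write_file"],
    "blocked_tools": ["execute_shell", "delete_file"],
    "max_tool_calls": 50,
}
```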
## Hook Points
- `before_tool_call` — Validate the tool name and arguments against policy. Block if disallowed.
- `after_tool_call` — Audit the result; check for sensitive-data leakage.
- `on_delegation` — When Agent A delegates to Agent B, verify trust-score thresholds.
- `on_policy_violation` — Callback when a policy rule is triggered (for alerting/monitoring).
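One way the four hook points could surface in Python — a sketch only; `GovernanceHooks`, `ToolCall`, the return conventions, and the 0.5 trust threshold are all assumptions, not proposed signatures:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ToolCall:
    tool: str
    args: dict

class GovernanceHooks:
    """Hypothetical base class exposing the four hook points."""

    def before_tool_call(self, call: ToolCall) -> bool:
        """Return False to block the call before it executes."""
        return True

    def after_tool_call(self, call: ToolCall, result: Any) -> Any:
        """Inspect or redact the result before it reaches the agent."""
        return result

    def on_delegation(self, delegator: str, delegate: str, trust: float) -> bool:
        """Gate delegation on a trust-score threshold (0.5 chosen arbitrarily)."""
        return trust >= 0.5

    def on_policy_violation(self, call: ToolCall, rule: str) -> None:
        """Alerting/monitoring callback; default is a no-op."""

class DenyShell(GovernanceHooks):
    """Example subclass: block one tool and report the violation."""

    def before_tool_call(self, call: ToolCall) -> bool:
        allowed = call.tool != "execute_shell"
        if not allowed:
            self.on_policy_violation(call, "blocked_tools")
        return allowed
```

Subclassing keeps simple policies declarative while letting complex ones (threat scoring, per-tenant rules) override only the hooks they need.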
## Why Not External Middleware?
External governance works but loses context:
- SDK-level hooks have access to the full agent context (conversation history, tool results, delegation chain)
- Hook execution order can be guaranteed (governance before business logic)
- Audit trails can capture the complete agent decision trace, not just tool I/O
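On the audit-trail point: with SDK-level hooks, each entry can record the delegation chain alongside tool I/O, and chaining entries by hash makes after-the-fact tampering detectable. A sketch of that scheme (entry format and field names are assumptions):

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel hash for the first entry

def append_entry(trail: list, event: dict) -> None:
    """Append a hash-chained audit entry; editing history breaks the chain."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    trail.append({"hash": entry_hash, "prev": prev_hash, "event": event})

def verify(trail: list) -> bool:
    """Recompute every hash; any mutation of an earlier entry fails the check."""
    prev_hash = GENESIS
    for entry in trail:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

An external proxy could hash-chain too, but only in-process hooks can populate each event with the conversation and delegation context that makes the trail useful.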
## Prior Art
- OpenAI Agents SDK has `guardrails` (input/output guardrails on model responses)
- Google ADK has `BasePlugin` with `before_tool_callback` / `after_tool_callback`
- PydanticAI has middleware proposals (#2885)
- We've built governance integrations for all of these: agentmesh-integrations
## Context
We maintain Agent-OS and have filed similar governance proposals across the agent ecosystem.