Summary
A standard GuardrailProvider interface that lets users plug any authorization provider into tool:before hooks, so that tool calls (exec, write, browser, MCP, messaging) can be evaluated before execution, not just exec commands via the current approvals system.
Problem to solve
OpenClaw has exec approvals for shell commands, but no general-purpose authorization for any other tool — file writes, browser actions, messaging, MCP tools, git operations, etc. An agent can write to ~/.ssh/authorized_keys, send messages to arbitrary recipients, or execute MCP tools with no policy check.
The community has been asking for this across 10+ issues spanning 2+ years:
The infrastructure is partially there — PR #22068 merged tool:before/tool:after internal hook events, and the plugin system defines before_tool_call. But there's no standard contract for guardrail providers to implement, so every solution is ad-hoc and incompatible.
Proposed solution
A minimal TypeScript interface that any guardrail provider can implement:
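As a minimal sketch of what such a contract might look like: only evaluate({ toolName, params }), allow, reasons and failClosed are named in this proposal. Every other identifier below (ToolCall, GuardrailDecision, GuardrailsConfig, the provider field, the denyWrites example) is an assumption made for illustration, not an existing OpenClaw type.

```typescript
// Hypothetical shapes for illustration only. Names other than
// evaluate({ toolName, params }), allow, reasons and failClosed are assumptions.

/** The tool call handed to a provider before execution. */
interface ToolCall {
  toolName: string;                // e.g. "exec", "write", "browser"
  params: Record<string, unknown>; // tool-specific arguments, shape varies per tool
}

/** What a provider returns for one tool call. */
interface GuardrailDecision {
  allow: boolean;
  reasons?: string[]; // surfaced to the agent when allow is false
}

/** The contract a guardrail provider would implement. */
interface GuardrailProvider {
  name: string;
  evaluate(call: ToolCall): Promise<GuardrailDecision>;
}

/** Assumed shape of a guardrails config section. */
interface GuardrailsConfig {
  provider?: string;   // hypothetical: identifier of the provider to load
  failClosed: boolean; // deny when the provider throws or times out
}

// A trivial provider, to show that the interface is implementable in a few lines.
const denyWrites: GuardrailProvider = {
  name: "deny-writes",
  async evaluate({ toolName }) {
    return toolName === "write"
      ? { allow: false, reasons: ["write tool is disabled by policy"] }
      : { allow: true };
  },
};

const exampleConfig: GuardrailsConfig = { provider: "deny-writes", failClosed: true };
```

Keeping params as an untyped record is a deliberate choice in this sketch: it lets one interface cover tools with very different parameter shapes, at the cost of each provider validating the fields it cares about.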
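How it works:
tool:before fires
provider.evaluate({ toolName, params }) is called
allow: false → block the tool call and return the reasons to the agent
allow: true or no provider → proceed normally
Key properties:
Builds on tool:before from PR #22068, no execution pipeline changes
Fail-closed option: provider errors can deny by default (configurable)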
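The evaluation flow and the fail-closed option could be wired together roughly as follows. This is a sketch under assumptions: onToolBefore and HookOutcome are hypothetical names, and how the real tool:before hook reports a block back to the agent is not specified here.

```typescript
// Types repeated so the sketch stands alone; all names are illustrative.
interface ToolCall { toolName: string; params: Record<string, unknown>; }
interface GuardrailDecision { allow: boolean; reasons?: string[]; }
interface GuardrailProvider { evaluate(call: ToolCall): Promise<GuardrailDecision>; }

/** Hypothetical result the hook handler hands back to the tool runner. */
interface HookOutcome { proceed: boolean; reasons: string[]; }

// Hypothetical body of a tool:before handler.
async function onToolBefore(
  call: ToolCall,
  provider: GuardrailProvider | undefined,
  failClosed: boolean,
): Promise<HookOutcome> {
  // No provider configured: proceed normally.
  if (!provider) return { proceed: true, reasons: [] };

  try {
    const decision = await provider.evaluate(call);
    if (decision.allow) return { proceed: true, reasons: [] };
    // Denied: block the tool and pass the reasons back to the agent.
    return { proceed: false, reasons: decision.reasons ?? ["denied by guardrail provider"] };
  } catch (err) {
    // Provider failure: deny when failClosed is set, otherwise fail open.
    const detail = err instanceof Error ? err.message : String(err);
    return failClosed
      ? { proceed: false, reasons: [`guardrail provider error: ${detail}`] }
      : { proceed: true, reasons: [] };
  }
}
```

With failClosed set to true, a provider that throws or rejects blocks the call. With it set to false, the same failure lets the call through, which trades safety for availability.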
Alternatives considered
1. Extend exec approvals to all tools
Tightly coupled to OpenClaw internals, requires core changes per tool category, doesn't support external providers or custom policies. The three-layer model (policy/allowlist/approval) is good for exec but doesn't generalize to tools with different parameter shapes.
2. Full interceptor pipeline (PR #6569 approach)
Too much scope: interceptors for tool calls, messages, and params in one PR, which was closed. A focused interface for tool authorization alone is more likely to land and can be extended later.
3. Per-plugin ad-hoc hooks
This is what exists today: each plugin implements its own before_tool_call handler with no shared contract. Providers can't be swapped, config isn't standardized, and there's no failClosed behavior. It works but doesn't compose.
Impact
Affected: Every OpenClaw user running agents with tool access — especially multi-channel setups (Slack, Discord, WhatsApp) where agents act on behalf of users, and enterprise/team deployments where agents touch production systems.
Severity: Blocks workflow for security-conscious deployments. Currently the only options are exec approvals (shell only) or trusting the agent entirely for everything else.
Frequency: Every tool call. Agents execute tools continuously — file operations, web fetches, messaging, MCP tools. Each one is an unguarded action.
Consequence:
Agents can write to sensitive paths, exfiltrate data via messaging/browser, or execute destructive MCP tools with no pre-check
Users who need guardrails must build custom solutions with no stable interface to target
Enterprise adoption blocked — no audit trail, no policy enforcement, no kill switch for non-exec tools
Evidence/examples
Community demand: 10+ issues listed above, plus two substantial PRs (#6095 modular guardrails, #6569 interceptor pipeline) that were closed - indicating demand exists but prior approaches were too broad.
Working reference: APort Agent Guardrails implements this pattern today as an OpenClaw plugin via before_tool_call. It maps tools to policies (exec → command policy, write → file policy, etc.), evaluates locally or via API, and blocks denied calls. Runs without any OpenClaw core changes - proving the interface is viable.
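To illustrate the tool-to-policy mapping idea in general terms, here is a small sketch of a locally evaluating provider. It is not APort's implementation: the Policy type, the two example policies, the params field names (path, command) and evaluateToolCall are all hypothetical.

```typescript
// Illustrative only: not taken from any existing plugin.
interface ToolCall { toolName: string; params: Record<string, unknown>; }
interface GuardrailDecision { allow: boolean; reasons?: string[]; }

/** A policy inspects one tool's params and returns a decision. */
type Policy = (params: Record<string, unknown>) => GuardrailDecision;

// File policy: refuse writes under ~/.ssh/ (assumes the write tool passes a "path" param).
const filePolicy: Policy = (params) => {
  const path = typeof params["path"] === "string" ? params["path"] : "";
  return path.startsWith("~/.ssh/")
    ? { allow: false, reasons: [`write to ${path} is not permitted`] }
    : { allow: true };
};

// Command policy: refuse recursive deletes (assumes exec passes a "command" param).
const commandPolicy: Policy = (params) => {
  const command = typeof params["command"] === "string" ? params["command"] : "";
  return /\brm\s+-rf\b/.test(command)
    ? { allow: false, reasons: ["recursive delete is not permitted"] }
    : { allow: true };
};

// exec → command policy, write → file policy; unmapped tools are allowed here.
const policiesByTool: Record<string, Policy | undefined> = {
  exec: commandPolicy,
  write: filePolicy,
};

async function evaluateToolCall(call: ToolCall): Promise<GuardrailDecision> {
  const policy = policiesByTool[call.toolName];
  return policy ? policy(call.params) : { allow: true };
}
```

Allowing unmapped tools by default is just one choice for the sketch. A stricter provider could deny any tool that has no policy.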
External research: Noma Security found 53% of enterprise users granted AI agents privileged access without policy controls. Cisco documented data exfiltration via third-party skills. A standard guardrail interface addresses both.
Prior art in other ecosystems:
OCI Runtime Spec — container interface, any runtime
OpenTelemetry Collector — observability interface, any backend
CSI (Kubernetes) — storage interface, any provider
Android/iOS permission models — capability declarations before install
Additional information
Happy to submit a focused PR if there's interest. Scope would be:
GuardrailProvider interface in packages/types/
guardrails config section in config schema
Wire into tool:before hook handling
Docs at docs/extensions/guardrails.md
No bundled providers, no changes to the steerable agent loop, no opinions on policy format. Just the interface - providers bring the opinions.
Exec approvals could optionally be refactored as a built-in guardrail provider in a follow-up, unifying the model. But that's separate scope.