[Feature]: Pluggable Guardrail Provider Interface for tool authorization #46441

### Summary

A standard `GuardrailProvider` interface that lets users plug any authorization provider into `tool:before` hooks, so that every tool call (exec, write, browser, MCP, messaging) can be evaluated before execution, rather than only exec commands as with the current approvals system.


### Problem to solve

OpenClaw has exec approvals for shell commands, but **no general-purpose authorization for any other tool** — file writes, browser actions, messaging, MCP tools, git operations, etc. An agent can `write` to `~/.ssh/authorized_keys`, send messages to arbitrary recipients, or execute MCP tools with no policy check.

The community has been asking for this across **10+ issues spanning 2+ years**, including:

- #513 — Path-based access control rules
- #1546 — Per-group tool policies
- #6823 — Execution guardrails (agent deleted OAuth credentials)
- #8081 — Multi-user RBAC
- #12202 — Per-agent file path access control
- #30504 — Middleware hooks for agent protocol enforcement
- #28753 — Route approval prompts to originating channel

The infrastructure is partially there — PR #22068 merged `tool:before`/`tool:after` internal hook events, and the plugin system defines `before_tool_call`. But there's no standard contract for guardrail providers to implement, so every solution is ad-hoc and incompatible.

### Proposed solution

A minimal TypeScript interface that any guardrail provider can implement:

```typescript
interface GuardrailProvider {
  name: string;
  version: string;
  evaluate(request: GuardrailRequest): Promise<GuardrailDecision>;
  healthCheck?(): Promise<{ ok: boolean; message?: string }>;
}

interface GuardrailRequest {
  toolName: string;           // "exec", "write", "browser", "mcp.tool_name"
  params: Record<string, unknown>;
  agentId?: string;
  sessionId?: string;
  timestamp: string;
}

interface GuardrailDecision {
  allow: boolean;
  reasons?: Array<{ code: string; message: string }>;
  metadata?: Record<string, unknown>;  // provider-specific (audit ID, signature, etc.)
}
```
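To give a sense of how small a provider can be, here is a minimal sketch of one that denies `write` calls to protected paths. The factory name `createPathDenylistProvider`, the reason codes, and the assumption that the `write` tool passes its target as `params.path` are all hypothetical; the proposal does not fix parameter shapes.

```typescript
// Interfaces repeated from the proposal so the sketch compiles on its own.
interface GuardrailRequest {
  toolName: string;
  params: Record<string, unknown>;
  agentId?: string;
  sessionId?: string;
  timestamp: string;
}

interface GuardrailDecision {
  allow: boolean;
  reasons?: Array<{ code: string; message: string }>;
  metadata?: Record<string, unknown>;
}

interface GuardrailProvider {
  name: string;
  version: string;
  evaluate(request: GuardrailRequest): Promise<GuardrailDecision>;
  healthCheck?(): Promise<{ ok: boolean; message?: string }>;
}

// Hypothetical provider: denies `write` calls whose path contains a protected fragment.
// Assumption: the `write` tool passes its target as `params.path`.
function createPathDenylistProvider(protectedFragments: string[]): GuardrailProvider {
  return {
    name: "path-denylist-example",
    version: "0.0.1",
    async evaluate(request: GuardrailRequest): Promise<GuardrailDecision> {
      // This provider only has an opinion about `write`; everything else passes.
      if (request.toolName !== "write") {
        return { allow: true };
      }
      const path = request.params["path"];
      if (typeof path !== "string") {
        return {
          allow: false,
          reasons: [{ code: "MISSING_PATH", message: "write call has no string `path` parameter" }],
        };
      }
      const hit = protectedFragments.find((fragment) => path.includes(fragment));
      if (hit !== undefined) {
        return {
          allow: false,
          reasons: [{ code: "PROTECTED_PATH", message: `writes to paths containing "${hit}" are denied` }],
        };
      }
      return { allow: true };
    },
    async healthCheck() {
      return { ok: true };
    },
  };
}
```

A provider built with `createPathDenylistProvider([".ssh/"])` would deny a write to `~/.ssh/authorized_keys` and allow unrelated tools. Substring matching is deliberately naive here; a real provider would likely normalize paths first.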

**Config:**

```yaml
guardrails:
  enabled: true
  failClosed: true
  provider: "my-guardrail-plugin"   # or "./local-guardrail.ts"
  config:
    # provider-specific settings
```
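One way the `guardrails` section could be typed and normalized after the YAML is parsed is sketched below. The function name `normalizeGuardrailsConfig` and both defaults are assumptions: guardrails stay disabled unless `enabled: true` and a `provider` are both present (consistent with the opt-in property), and `failClosed` defaults to `true` when omitted, which the proposal does not specify.

```typescript
// Shape of the `guardrails` config section after parsing (field names from the proposal).
interface GuardrailsConfig {
  enabled: boolean;
  failClosed: boolean;
  provider?: string;
  config: Record<string, unknown>;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical helper: turns an untrusted parsed value into a fully populated config.
function normalizeGuardrailsConfig(raw: unknown): GuardrailsConfig {
  const section = isRecord(raw) ? raw : {};
  const rawProvider = section["provider"];
  const rawConfig = section["config"];
  const provider = typeof rawProvider === "string" ? rawProvider : undefined;
  return {
    // Opt-in: only active when explicitly enabled AND a provider is named.
    enabled: section["enabled"] === true && provider !== undefined,
    // Assumption: fail closed unless the user explicitly sets `failClosed: false`.
    failClosed: section["failClosed"] !== false,
    provider,
    config: isRecord(rawConfig) ? rawConfig : {},
  };
}
```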

**How it works:**
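1. `tool:before` fires
2. If a guardrail provider is configured, build a `GuardrailRequest` from the tool call (`toolName`, `params`, `timestamp`, plus `agentId`/`sessionId` when known) and call `provider.evaluate(request)`
3. `allow: false` → block the tool and return `reasons` to the agent
4. `allow: true` or no provider → proceed normally
5. Provider error → deny if `failClosed: true`, otherwise proceed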
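The flow above, including the fail-closed behavior on provider errors, can be sketched as a single function. `authorizeToolCall` is a hypothetical name, not an existing OpenClaw API, and the ISO 8601 timestamp format and the `PROVIDER_ERROR` reason code are assumptions.

```typescript
// Minimal copies of the proposed interfaces so the sketch compiles on its own.
interface GuardrailRequest {
  toolName: string;
  params: Record<string, unknown>;
  agentId?: string;
  sessionId?: string;
  timestamp: string;
}

interface GuardrailDecision {
  allow: boolean;
  reasons?: Array<{ code: string; message: string }>;
  metadata?: Record<string, unknown>;
}

interface GuardrailProvider {
  name: string;
  version: string;
  evaluate(request: GuardrailRequest): Promise<GuardrailDecision>;
}

// Hypothetical glue a `tool:before` handler could call before running the tool.
async function authorizeToolCall(
  provider: GuardrailProvider | undefined,
  toolName: string,
  params: Record<string, unknown>,
  failClosed: boolean,
): Promise<GuardrailDecision> {
  // No provider configured: proceed normally.
  if (provider === undefined) {
    return { allow: true };
  }
  const request: GuardrailRequest = {
    toolName,
    params,
    timestamp: new Date().toISOString(), // assumption: ISO 8601 string
  };
  try {
    return await provider.evaluate(request);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    if (failClosed) {
      // Provider failure is treated as a denial.
      return { allow: false, reasons: [{ code: "PROVIDER_ERROR", message }] };
    }
    // Fail-open: allow the call but record that the provider failed.
    return { allow: true, metadata: { providerError: message } };
  }
}
```

The caller would block the tool and surface `reasons` to the agent whenever the returned decision has `allow: false`.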

**Key properties:**
- **Opt-in** — zero impact when not configured
- **Provider-agnostic** — users pick their own implementation (simple allowlist, policy engine, enterprise service)
- **Builds on existing infra** — uses `tool:before` from PR #22068, no execution pipeline changes
- **Fail-closed option** — provider errors can deny by default (configurable)


### Alternatives considered

**1. Extend exec approvals to all tools**
This would be tightly coupled to OpenClaw internals, require core changes for each tool category, and rule out external providers and custom policies. The three-layer model (policy/allowlist/approval) works well for exec but doesn't generalize to tools with different parameter shapes.

**2. Full interceptor pipeline (PR #6569 approach)**
Too much scope: it bundled interceptors for tool calls, messages, and params into one PR, which was closed. A focused interface for just tool authorization is more likely to land and can be extended later.

**3. Per-plugin ad-hoc hooks**
This is what exists today: each plugin implements its own `before_tool_call` handler with no shared contract. Providers can't be swapped, config isn't standard, and there's no `failClosed` behavior. It works, but it doesn't compose.


### Impact

**Affected:** Every OpenClaw user running agents with tool access — especially multi-channel setups (Slack, Discord, WhatsApp) where agents act on behalf of users, and enterprise/team deployments where agents touch production systems.

**Severity:** Blocks workflows for security-conscious deployments. Currently the only options are exec approvals (shell only) or trusting the agent entirely for everything else.

**Frequency:** Every tool call. Agents execute tools continuously — file operations, web fetches, messaging, MCP tools. Each one is an unguarded action.

**Consequence:**
- Agents can write to sensitive paths, exfiltrate data via messaging/browser, or execute destructive MCP tools with no pre-check
- Users who need guardrails must build custom solutions with no stable interface to target
- Enterprise adoption blocked — no audit trail, no policy enforcement, no kill switch for non-exec tools
- Real incidents: #6823 (agent deleted OAuth credentials) and #1971 (~900 exposed instances without auth)

### Evidence/examples

**Community demand:** 10+ issues, including those listed above, plus two substantial PRs (#6095 modular guardrails, #6569 interceptor pipeline) that were closed. This indicates that demand exists but prior approaches were too broad.

**Working reference:** [APort Agent Guardrails](https://github.com/APortHQ/aport-agent-guardrails) implements this pattern today as an OpenClaw plugin via `before_tool_call`. It maps tools to policies (exec → command policy, write → file policy, etc.), evaluates locally or via API, and blocks denied calls. It runs without any OpenClaw core changes, which shows the pattern is viable.
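The tool-to-policy mapping pattern can be illustrated with a small router. This is a generic sketch of the idea, not APort's actual code; `createRoutingEvaluator`, the `Policy` type, and the assumption that `exec` passes its command as `params.command` are all hypothetical.

```typescript
interface GuardrailRequest {
  toolName: string;
  params: Record<string, unknown>;
  timestamp: string;
}

interface GuardrailDecision {
  allow: boolean;
  reasons?: Array<{ code: string; message: string }>;
}

// A policy inspects one tool's parameters and returns a decision.
type Policy = (params: Record<string, unknown>) => GuardrailDecision;

// Hypothetical router: picks the policy registered for the tool name,
// falling back to a caller-supplied default for unmapped tools.
function createRoutingEvaluator(
  policies: Map<string, Policy>,
  fallback: Policy,
): (request: GuardrailRequest) => Promise<GuardrailDecision> {
  return async (request) => {
    const policy = policies.get(request.toolName) ?? fallback;
    return policy(request.params);
  };
}

// Example command policy. Assumption: `exec` passes its command as `params.command`.
const commandPolicy: Policy = (params) => {
  const command = params["command"];
  if (typeof command === "string" && command.includes("rm -rf")) {
    return { allow: false, reasons: [{ code: "DESTRUCTIVE_COMMAND", message: "rm -rf is not permitted" }] };
  }
  return { allow: true };
};

// Unmapped tools are denied here; an allow-by-default fallback is equally possible.
const denyUnmapped: Policy = () => ({
  allow: false,
  reasons: [{ code: "NO_POLICY", message: "no policy registered for this tool" }],
});

const evaluate = createRoutingEvaluator(new Map([["exec", commandPolicy]]), denyUnmapped);
```

The returned `evaluate` function has the same signature as `GuardrailProvider.evaluate`, so it could serve as the body of a provider.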

**External research:** Noma Security found 53% of enterprise users granted AI agents privileged access without policy controls. Cisco documented data exfiltration via third-party skills. A standard guardrail interface addresses both.

**Prior art in other ecosystems:**
- OCI Runtime Spec — container interface, any runtime
- OpenTelemetry Collector — observability interface, any backend
- CSI (Kubernetes) — storage interface, any provider
- Android/iOS permission models — capability declarations before install

### Additional information

Happy to submit a focused PR if there's interest. Scope would be:
- `GuardrailProvider` interface in `packages/types/`
- `guardrails` config section in config schema
- Wire into `tool:before` hook handling
- Docs at `docs/extensions/guardrails.md`

No bundled providers, no changes to the steerable agent loop, no opinions on policy format. Just the interface; providers bring the opinions.

Exec approvals could optionally be refactored as a built-in guardrail provider in a follow-up, unifying the model. But that's separate scope.

