refactor: migrate guardrails to plugin hook system by Scrattlebeard · Pull Request #1 · grayswansecurity/openclaw

Scrattlebeard · 2026-02-01T16:06:02Z

Unify the guardrails system with the existing plugin hook infrastructure:

- Add before_request and after_response hooks to plugin types
- Extend before_tool_call/after_tool_call with richer context and modify capabilities (after_tool_call now returns results)
- Wire up all four hook stages in the agent runner and tool adapter
- Move Gray Swan implementation to extensions/grayswan/ plugin
- Remove guardrail registry and apply functions

The Gray Swan guardrail now follows the standard plugin pattern, registering handlers via api.on() for each stage. Configuration remains unchanged (guardrails.grayswan in openclaw.json).

This enables third-party guardrail plugins using the same hook API.
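
To make the registration pattern concrete, here is a minimal sketch of what a third-party guardrail plugin might look like. Only `api.on()` and the four stage names come from this PR. The `HookEvent`, `HookResult`, and `PluginApi` shapes, the `registerKeywordGuardrail` function, and the `FakeApi` host stand-in are all hypothetical, and the real OpenClaw plugin types may well differ.

```typescript
// Hypothetical shapes for illustration; the real OpenClaw plugin types may differ.
type HookStage = "before_request" | "before_tool_call" | "after_tool_call" | "after_response";

interface HookEvent {
  stage: HookStage;
  content: string; // text under inspection (prompt, tool arguments, tool result, or reply)
}

// Assumed result shape: a handler either asks to block or returns undefined.
interface HookResult {
  block?: boolean;
  reason?: string;
}

type HookHandler = (event: HookEvent) => Promise<HookResult | undefined>;

interface PluginApi {
  on(stage: HookStage, handler: HookHandler): void;
}

const STAGES: HookStage[] = ["before_request", "before_tool_call", "after_tool_call", "after_response"];

// A toy guardrail: registers one handler per stage and blocks banned phrases.
function registerKeywordGuardrail(api: PluginApi, banned: string[]): void {
  for (const stage of STAGES) {
    api.on(stage, async (event) => {
      const text = event.content.toLowerCase();
      const hit = banned.find((phrase) => text.includes(phrase.toLowerCase()));
      return hit ? { block: true, reason: `${stage}: matched "${hit}"` } : undefined;
    });
  }
}

// In-memory stand-in for the host, used only to exercise the sketch.
class FakeApi implements PluginApi {
  readonly handlers = new Map<HookStage, HookHandler[]>();

  on(stage: HookStage, handler: HookHandler): void {
    const list = this.handlers.get(stage) ?? [];
    list.push(handler);
    this.handlers.set(stage, list);
  }

  // Runs the handlers for one stage and returns the first blocking result, if any.
  async emit(event: HookEvent): Promise<HookResult> {
    for (const handler of this.handlers.get(event.stage) ?? []) {
      const result = await handler(event);
      if (result !== undefined && result.block) return result;
    }
    return {};
  }
}
```

Calling `registerKeywordGuardrail(new FakeApi(), ["rm -rf"])` registers one handler for each of the four stages. A real guardrail would replace the keyword check with a call to its classifier.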



Add a shared base class/factory that simplifies guardrail extension implementations by handling common hook registration boilerplate.

Extensions now implement a simple interface:
- evaluate(ctx, config, api) -> GuardrailEvaluation
- formatViolationMessage(evaluation, location) -> string
- onRegister(api, config) [optional]

The factory handles:
- All 4 hook registrations (before_request, before_tool_call, after_tool_call, after_response)
- Stage config resolution (enabled, mode, blockMode, includeHistory)
- Error handling with failOpen support
- Monitor mode (log-only)
- Content extraction per stage

Refactored extensions:
- llamaguard: 634 → 373 lines (-41%)
- grayswan: 590 → 394 lines (-33%)
- gpt-oss-safeguard: 550 → 311 lines (-43%)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
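
The factory described in this commit can be sketched roughly as follows. The names `evaluate`, `formatViolationMessage`, `failOpen`, `createGuardrailPlugin`, and the four stage names appear in the commit messages. Every type shape, the `"block" | "monitor"` mode values, the `api.log` method, and the fallback message are assumptions made for illustration. `blockMode`, `includeHistory`, `onRegister`, and per-stage content extraction are left out for brevity.

```typescript
// Sketch only: type shapes, mode values, and api.log are assumptions, not the real API.
type Stage = "before_request" | "before_tool_call" | "after_tool_call" | "after_response";
const ALL_STAGES: Stage[] = ["before_request", "before_tool_call", "after_tool_call", "after_response"];

interface GuardrailEvaluation { safe: boolean; reason?: string }
interface StageConfig { enabled: boolean; mode: "block" | "monitor" }
interface GuardrailConfig { failOpen: boolean; stages: Partial<Record<Stage, StageConfig>> }
interface EvalContext { stage: Stage; content: string }
interface HookOutcome { blocked: boolean; message?: string }
type Handler = (content: string) => Promise<HookOutcome>;

interface PluginApi {
  on(stage: Stage, handler: Handler): void;
  log(message: string): void;
}

// What an extension author supplies.
interface GuardrailImpl {
  evaluate(ctx: EvalContext, config: GuardrailConfig, api: PluginApi): Promise<GuardrailEvaluation>;
  formatViolationMessage(evaluation: GuardrailEvaluation, location: Stage): string;
}

function createGuardrailPlugin(impl: GuardrailImpl, config: GuardrailConfig) {
  return {
    register(api: PluginApi): void {
      for (const stage of ALL_STAGES) {
        const stageConfig = config.stages[stage];
        if (stageConfig === undefined || !stageConfig.enabled) continue; // stage disabled
        const mode = stageConfig.mode;

        api.on(stage, async (content) => {
          let evaluation: GuardrailEvaluation;
          try {
            evaluation = await impl.evaluate({ stage, content }, config, api);
          } catch (err) {
            // failOpen: let traffic through when the guardrail itself fails.
            api.log(`guardrail error at ${stage}: ${String(err)}`);
            return config.failOpen
              ? { blocked: false }
              : { blocked: true, message: "Guardrail unavailable." };
          }
          if (evaluation.safe) return { blocked: false };

          const message = impl.formatViolationMessage(evaluation, stage);
          if (mode === "monitor") {
            api.log(`[monitor] ${message}`); // log-only, never blocks
            return { blocked: false };
          }
          return { blocked: true, message };
        });
      }
    },
  };
}

// Example extension: flags any content that mentions "secret".
const keywordGuard: GuardrailImpl = {
  async evaluate(ctx) {
    return ctx.content.includes("secret")
      ? { safe: false, reason: "mentions a secret" }
      : { safe: true };
  },
  formatViolationMessage(evaluation, location) {
    return `Blocked at ${location}: ${evaluation.reason ?? "policy violation"}`;
  },
};
```

Under this sketch an extension only decides whether content is safe and how to phrase a violation. Stage selection, error handling, and monitor mode live in one place, which is the kind of consolidation the line-count reductions above suggest.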

Add two new execution-safety guardrails using the createGuardrailPlugin framework:

command-safety-guard:
- Blocks destructive commands (rm -rf, dd, mkfs, fork bombs)
- Prevents credential exfiltration (cat ~/.ssh/id_*, base64 | curl)
- Detects privilege escalation attempts (sudo passwd, visudo)
- Configurable: extra patterns, allow patterns, disable rules

security-audit:
- Restricts access to sensitive files (SSH keys, API tokens, shell configs)
- Covers cloud credentials (AWS, GCloud, Azure)
- Covers package manager auth (npm, PyPI)
- Operation-aware: some rules block read, others block write
- Configurable: extra paths, allow paths, disable rules

Both plugins support monitor mode for logging without blocking.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
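
As a rough illustration of how rule-based checks like these could work, the sketch below matches commands against regex rules and file paths against operation-aware rules. The rule IDs, the regular expressions, the option names (`extraPatterns`, `allowPatterns`, `disabledRules`), and the choice of which paths block reads versus writes are all assumptions. The plugins' actual rule sets and configuration keys are not shown in this PR description.

```typescript
// Hypothetical rules and option names, for illustration only.
interface CommandRule { id: string; pattern: RegExp }

const COMMAND_RULES: CommandRule[] = [
  // rm with both -r and -f in one flag group, in either order (e.g. -rf, -fr)
  { id: "rm-rf", pattern: /\brm\s+-(?=[a-z]*r)(?=[a-z]*f)[a-z]+/i },
  { id: "mkfs", pattern: /\bmkfs(\.\w+)?\b/ },
  { id: "dd-to-device", pattern: /\bdd\s+.*\bof=\/dev\// },
  { id: "ssh-key-read", pattern: /\bcat\s+~\/\.ssh\/id_/ },
  { id: "sudo-passwd", pattern: /\bsudo\s+passwd\b/ },
];

interface GuardOptions {
  extraPatterns?: RegExp[]; // additional patterns to block
  allowPatterns?: RegExp[]; // commands matching these are always allowed
  disabledRules?: string[]; // built-in rule IDs to skip
}

interface CommandVerdict { blocked: boolean; ruleId?: string }

function checkCommand(command: string, options: GuardOptions = {}): CommandVerdict {
  const allow = options.allowPatterns ?? [];
  if (allow.some((p) => p.test(command))) return { blocked: false };

  const disabled = options.disabledRules ?? [];
  const extras = (options.extraPatterns ?? []).map(
    (pattern, i): CommandRule => ({ id: `extra-${i}`, pattern }),
  );
  const active = COMMAND_RULES.filter((r) => !disabled.includes(r.id)).concat(extras);

  const hit = active.find((r) => r.pattern.test(command));
  return hit ? { blocked: true, ruleId: hit.id } : { blocked: false };
}

// Operation-aware path rules: each rule lists the operations it blocks.
type FileOp = "read" | "write";
interface PathRule { id: string; pattern: RegExp; blocks: FileOp[] }

const PATH_RULES: PathRule[] = [
  { id: "ssh-private-key", pattern: /(^|\/)\.ssh\/id_[^/]+$/, blocks: ["read", "write"] },
  { id: "aws-credentials", pattern: /(^|\/)\.aws\/credentials$/, blocks: ["read", "write"] },
  // Assumption: shell configs may be read but not modified.
  { id: "shell-config", pattern: /(^|\/)\.(bashrc|zshrc|profile)$/, blocks: ["write"] },
];

function checkPathAccess(path: string, op: FileOp): boolean {
  // Returns true when the access should be blocked.
  return PATH_RULES.some((r) => r.blocks.includes(op) && r.pattern.test(path));
}
```

In this sketch allow patterns take precedence over every block rule, so an operator could permit a specific command such as `rm -rf ./build` without disabling the `rm-rf` rule entirely. Pattern matching of this kind is easy to evade with shell quoting or indirection, so it is best read as a first line of defense.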

Scrattlebeard added 2 commits February 1, 2026 17:04

Add LlamaGuard and gpt-oss-safeguard plugins. Remove guardrails confi…

5c74c85

…guration and move config responsibility to each plugin. Extract shared guardrail-utils

pauloportella mentioned this pull request Feb 1, 2026

feat: interceptor pipeline for tool, message, and params events openclaw/openclaw#6569

Closed

5 tasks

Scrattlebeard and others added 2 commits February 1, 2026 13:34

Reapor-Yurnero merged commit 965dbc0 into feat/guardrail_interface Feb 2, 2026

Reapor-Yurnero pushed a commit that referenced this pull request Feb 9, 2026

test: add /approve gateway scope coverage (#1) (thanks @mitsuhiko)

d41acf9

Reapor-Yurnero pushed a commit that referenced this pull request Feb 9, 2026

test: reset /approve mock per test (#1) (thanks @mitsuhiko)

4df4435

Reapor-Yurnero merged 4 commits into feat/guardrail_interface from feat/guardrails_as_plugins
