Summary
Add optional middleware hooks to OpenClaw that allow runtime enforcement of agent protocols, beyond post-hoc scoring.
Problem
Current ACT (Agent Capability Test) framework scores agents after the fact. We can detect protocol violations, but cannot prevent them. This leads to:
- Self-assessment inflation — agents describe compliance but dont execute it
- WAL violations — agents respond before writing state files
- Handshake failures — agents skip the startup handshake block
- Tool proliferation — agents use direct tools instead of wrapper scripts
Desired Hooks
1. tool_choice lock (per-session)
Allow forcing a session to text-only mode for the first turn:
session.set_tool_choice("none") # First turn: text only
session.set_tool_choice("auto") # Subsequent turns: normal
Use case: Enforce handshake output before any tool calls.
2. Pre-tool-call validation
Allow intercepting tool calls before execution:
def tool_validator(call):
if not call_has_valid_nonce(call):
return {"error": "missing run_token"}
return {"ok": true}
gateway.register_tool_validator(tool_validator)
Use case: Ensure WAL (Write-Ahead Logging) is written before any substantive tool calls.
3. Turn-phase markers
Allow marking turns as protocol preamble vs substantive work:
session.begin_phase("preflight")
... protocol handshake/wal ...
session.end_phase("preflight")
session.begin_phase("work")
Use case: Separate preflight compliance from scenario work in evidence/tracking.
Why This Matters
Without runtime enforcement, were limited to:
- Post-hoc scoring (measures but doesnt prevent)
- Retry loops (works but inefficient)
- Wrapper scripts (fragile, easily bypassed)
With hooks, we can:
- Block non-compliant behavior before it happens
- Make compliance mechanically enforced vs. honor-system
- Build finite-state machines for agent behavior
Alternative Considered
We considered building this outside OpenClaw (external harness), but:
- Tool routing lives inside OpenClaw
- Session state is in OpenClaw
- The right place for these hooks is the gateway
Environment
- OpenClaw 2026.2.26
- Running ACT tests via run-test.sh
- Using Sonnet and MiniMax models
This would enable the ACT preflight gate system to work with true runtime enforcement rather than retry loops.
Summary
Add optional middleware hooks to OpenClaw that allow runtime enforcement of agent protocols, beyond post-hoc scoring.
Problem
Current ACT (Agent Capability Test) framework scores agents after the fact. We can detect protocol violations, but cannot prevent them. This leads to:
Desired Hooks
1. tool_choice lock (per-session)
Allow forcing a session to text-only mode for the first turn:
session.set_tool_choice("none") # First turn: text only
session.set_tool_choice("auto") # Subsequent turns: normal
Use case: Enforce handshake output before any tool calls.
2. Pre-tool-call validation
Allow intercepting tool calls before execution:
def tool_validator(call):
if not call_has_valid_nonce(call):
return {"error": "missing run_token"}
return {"ok": true}
gateway.register_tool_validator(tool_validator)
Use case: Ensure WAL (Write-Ahead Logging) is written before any substantive tool calls.
3. Turn-phase markers
Allow marking turns as protocol preamble vs substantive work:
session.begin_phase("preflight")
... protocol handshake/wal ...
session.end_phase("preflight")
session.begin_phase("work")
Use case: Separate preflight compliance from scenario work in evidence/tracking.
Why This Matters
Without runtime enforcement, were limited to:
With hooks, we can:
Alternative Considered
We considered building this outside OpenClaw (external harness), but:
Environment
This would enable the ACT preflight gate system to work with true runtime enforcement rather than retry loops.