Context
Relates to PR #249 (trust context and audience policy) and the runaway tool loop
fix in #350 / PR #351.
The current security model is grant/deny per tool category: shell_execute either
has a grant or it doesn't. This is too coarse. In the incident from #350, the bot
ran git remote remove and git push, re-created files, and ignored user corrections,
all auto-approved because shell_execute has a blanket grant.
Proposal
Add command-level approval within tool grants, similar to Claude Code's permission
model:
Three tiers
Tier 1 — Auto-approve (read-only, no side effects):
file_read, web_search, web_fetch, search_tools, list_reminders
Tier 2 — Approve-on-first-use (side effects):
shell_execute per command pattern, file_write per path prefix, set_reminder.
In default mode: ask on first use, cache approval for the session. In permissive
mode: auto-approve.
Tier 3 — Always confirm (destructive / external-facing):
rm -rf, git push --force, git remote remove, creating repos, sending
external messages. Confirm even in permissive mode (unless policy override).
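A minimal sketch of how a tool call might be sorted into the three tiers. The `Tier` enum, the `classify` function, and the rule format are hypothetical names for illustration; only the tool names and the example commands come from the proposal. Non-shell Tier 3 actions (creating repos, sending external messages) are not modeled here.

```python
import shlex
from enum import Enum


class Tier(Enum):
    AUTO_APPROVE = 1    # read-only, no side effects
    FIRST_USE = 2       # side effects: ask once, then cache
    ALWAYS_CONFIRM = 3  # destructive or external-facing


# Tier 1 tools as listed in the proposal.
READ_ONLY_TOOLS = {"file_read", "web_search", "web_fetch", "search_tools", "list_reminders"}

# Hypothetical rule format: (leading tokens, flags that must also appear).
# Covers only the shell examples from the proposal; variants such as
# "rm -fr" or "rm -r -f" would need extra rules.
DESTRUCTIVE_RULES = [
    (("rm",), {"-rf"}),
    (("git", "push"), {"--force"}),
    (("git", "remote", "remove"), set()),
]


def classify(tool: str, command: str = "") -> Tier:
    """Map a tool call to an approval tier."""
    if tool in READ_ONLY_TOOLS:
        return Tier.AUTO_APPROVE
    if tool == "shell_execute":
        try:
            tokens = shlex.split(command)
        except ValueError:
            # Unparseable quoting: fail closed and always ask.
            return Tier.ALWAYS_CONFIRM
        for prefix, flags in DESTRUCTIVE_RULES:
            if tuple(tokens[: len(prefix)]) == prefix and flags <= set(tokens):
                return Tier.ALWAYS_CONFIRM
    # Everything else (file_write, set_reminder, unknown tools) asks on first use.
    return Tier.FIRST_USE
```

Matching on parsed tokens rather than raw substrings means `git push origin main --force` is caught even though the flag comes last, while `git status` stays in Tier 2.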
Approval cache
- Per-session (default): approvals reset each Slack thread
- Per-user persistent (opt-in): stored in policy config
- Pattern-based: command prefix matching, not exact string (git * approved, not just git push -u origin master)
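One way the per-session, pattern-based cache could look. This is a sketch only: `ApprovalCache` and its methods are hypothetical, the session key is assumed to be some Slack thread identifier, and glob matching via `fnmatchcase` stands in for whatever prefix matcher is actually chosen.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class ApprovalCache:
    """Per-session store of approved command patterns (hypothetical API)."""

    def __init__(self) -> None:
        # session id (assumed: one per Slack thread) -> approved glob patterns
        self._approved: dict[str, set[str]] = defaultdict(set)

    def approve(self, session_id: str, pattern: str) -> None:
        """Remember a pattern such as "git *" for this session only."""
        self._approved[session_id].add(pattern)

    def is_approved(self, session_id: str, command: str) -> bool:
        """True if any cached pattern for this session matches the command."""
        patterns = self._approved.get(session_id, ())
        return any(fnmatchcase(command, p) for p in patterns)

    def reset(self, session_id: str) -> None:
        """Drop all approvals for a session, e.g. when a new thread starts."""
        self._approved.pop(session_id, None)
```

Note that a broad pattern like `git *` also matches `git push --force`, so the Tier 3 check would need to run before the cache is consulted.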
Policy-level override
Users can configure approval mode in ACL config:
{
"tool_approval_mode": "default" // "default" | "permissive" | "strict"
}
Or grant it in conversation: "you have full permission" → session-level override.
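The interaction between tier, mode, and cache can be sketched as a single decision function. The function name and parameters are hypothetical. The proposal does not define "strict", so this sketch assumes it means "ask on every Tier 2 call, ignoring the cache"; `tier3_override` models the "unless policy override" escape hatch.

```python
VALID_MODES = {"default", "permissive", "strict"}


def needs_prompt(tier: int, mode: str, cached: bool, tier3_override: bool = False) -> bool:
    """Return True if the bot should pause and ask before executing."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown tool_approval_mode: {mode!r}")
    if tier == 1:
        # Read-only tools never prompt.
        return False
    if tier == 3:
        # Destructive actions prompt in every mode unless policy says otherwise.
        return not tier3_override
    # Tier 2: behavior depends on the configured mode.
    if mode == "permissive":
        return False
    if mode == "strict":
        # Assumption: strict ignores cached approvals and always asks.
        return True
    return not cached
```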
Slack UX
Bot pauses mid-turn and asks inline before executing:
I'd like to run: git push -u origin master
Approve or deny?
Waits for user text response before proceeding.
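A rough sketch of the prompt text and reply handling. The accepted words and both function names are assumptions, not part of the proposal; anything unrecognized returns `None` so the caller can ask again instead of guessing.

```python
from typing import Optional

# Assumed vocabulary; the proposal only specifies "Approve or deny?".
APPROVE_WORDS = {"approve", "approved", "yes", "y", "ok"}
DENY_WORDS = {"deny", "denied", "no", "n"}


def format_prompt(command: str) -> str:
    """Build the inline approval message shown in the Slack thread."""
    return f"I'd like to run: {command}\nApprove or deny?"


def parse_reply(text: str) -> Optional[bool]:
    """True for approval, False for denial, None if the reply is unclear."""
    word = text.strip().lower().strip(".!")
    if word in APPROVE_WORDS:
        return True
    if word in DENY_WORDS:
        return False
    return None
```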
Prerequisites
- … ask for approval, and hear the response before the next iteration
Beta testing approach
Deploy with default posture and run the eval suite (evals/run-evals.sh) to
measure impact. Key metrics:
- Eval pass rate before vs after (should not degrade significantly)
- Number of approval prompts per eval run (should be minimal for read-only cases)
- Complex multi-step cases must still complete without excessive interruption
If the posture degrades eval scores significantly, it's too strict. The eval
suite's per-version trend tracking (SQLite results DB) provides the comparison
baseline.
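The before/after comparison could be scripted against the results DB along these lines. The table layout (`results` with `version`, `case_id`, `passed`) is purely an assumption for illustration; the real schema used by the eval suite is not described here and may differ.

```python
import sqlite3
from typing import Optional

# Hypothetical schema; the actual results DB layout is not specified.
SCHEMA = "CREATE TABLE IF NOT EXISTS results (version TEXT, case_id TEXT, passed INTEGER)"


def pass_rate(conn: sqlite3.Connection, version: str) -> Optional[float]:
    """Fraction of passing cases for a version, or None if it has no rows."""
    row = conn.execute(
        "SELECT AVG(passed) FROM results WHERE version = ?", (version,)
    ).fetchone()
    return row[0]


def pass_rate_delta(conn: sqlite3.Connection, before: str, after: str) -> Optional[float]:
    """Change in pass rate from `before` to `after`; negative means a regression."""
    old, new = pass_rate(conn, before), pass_rate(conn, after)
    if old is None or new is None:
        return None
    return new - old
```

A negative delta after enabling the default posture would be the signal that the approval prompts are interrupting cases that previously completed.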
Related