feat(security): TrajectorySentinel and ScopedToolExecutor for capability governance #3588
Merged
…ability governance

Implements Phase 1 of spec 050 (security capability governance), addressing research findings from issues #3563 (Aethelgard), #3569 (CapSeal/SUDP spec), and #3570 (SafeAgent).

TrajectorySentinel (zeph-core):
- Accumulates risk signals across turns with multiplicative decay (default 0.85/turn)
- 9 signal types: VigilFlagged, PolicyDeny, PiiRedaction, ToolFailure, HighCallRate, UnusualReadVolume, ToolPairTransition, OutOfScope, TrajectoryAutoRecover
- 4 risk levels: Normal, Elevated, High, Critical
- Hard reset to 0.0 after auto_recover_after_turns (default 16) consecutive Critical turns
- Subagent score inheritance via spawn_child() with configurable inheritance factor
- advance_turn() fires before gate evaluation on every turn (spec Invariant 2)
- RiskAlert never exposed to LLM-callable tools (spec NEVER clause)

ScopedToolExecutor (zeph-tools):
- Generic wrapper enforcing per-task-type tool allow-lists from config
- Mandatory namespace prefixes: builtin:, skill:, mcp:, acp:, a2a:
- Build-time glob resolution to HashSet<ToolId>; strict for builtin:/skill:, provisional (re-resolved on dynamic registration) for mcp:/acp:/a2a:
- scope_at_definition and scope_at_dispatch fields in AuditEntry (FR-CG-012)
- Wired outermost in executor stack in runner.rs

PolicyGateExecutor integration:
- Reads trajectory_risk_slot atomic; overrides Allow → Deny at Critical
- RiskSignalQueue decouples signal recording from sentinel to avoid circular deps
- PolicyDeny and OutOfScope push signals; sanitizer pushes VigilFlagged/PiiRedaction

CLI/TUI/config integration:
- /trajectory status, /trajectory reset (operator-only), /scope list, /scope reset
- --scope <task_type> CLI flag
- [security.trajectory] and [security.capability_scopes] config sections
- --init wizard step for trajectory thresholds
- --migrate-config picks up new sections from default.toml automatically

CapSeal/SUDP (issue #3569): spec-only in specs/050-security-capability-governance/spec.md; Phase 3 design around BoundSecret<Op> typestate documented.

Closes #3563
Closes #3570
Resolves #3569
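The accumulator behavior described in the commit message (multiplicative decay, level thresholds, hard reset after consecutive Critical turns, child inheritance, and the Allow → Deny override) can be sketched roughly as follows. This is a minimal illustration, not the zeph-core code: the type `Sentinel`, the per-signal weights, the level thresholds, the order of decay versus signal recording, and the helper `apply_risk_override` are all assumptions. Only the 0.85 decay and the 16-turn auto-recover default come from the description above.

```rust
// Hypothetical sketch: names, weights, and thresholds are assumptions,
// not the actual zeph-core implementation.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RiskLevel { Normal, Elevated, High, Critical }

/// A subset of the nine signal types listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Signal { VigilFlagged, PolicyDeny, ToolFailure }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision { Allow, Deny }

pub struct Sentinel {
    score: f64,
    decay: f64,                    // default 0.85 per turn (from the PR text)
    auto_recover_after_turns: u32, // default 16 (from the PR text)
    critical_streak: u32,
}

impl Sentinel {
    pub fn new() -> Self {
        Sentinel { score: 0.0, decay: 0.85, auto_recover_after_turns: 16, critical_streak: 0 }
    }

    /// Adds a signal's weight to the running score (weights are made up).
    pub fn record(&mut self, signal: Signal) {
        self.score += match signal {
            Signal::VigilFlagged => 2.0,
            Signal::PolicyDeny => 1.0,
            Signal::ToolFailure => 0.5,
        };
    }

    /// Maps the score to a level (thresholds are made up).
    pub fn level(&self) -> RiskLevel {
        if self.score >= 6.0 {
            RiskLevel::Critical
        } else if self.score >= 3.0 {
            RiskLevel::High
        } else if self.score >= 1.0 {
            RiskLevel::Elevated
        } else {
            RiskLevel::Normal
        }
    }

    /// Called once per turn, before gate evaluation: decays the score,
    /// tracks the Critical streak, and hard-resets once the streak is long enough.
    pub fn advance_turn(&mut self) -> RiskLevel {
        self.score *= self.decay;
        if self.level() == RiskLevel::Critical {
            self.critical_streak += 1;
            if self.critical_streak >= self.auto_recover_after_turns {
                self.score = 0.0;
                self.critical_streak = 0;
            }
        } else {
            self.critical_streak = 0;
        }
        self.level()
    }

    /// A subagent starts with a configurable fraction of the parent's score.
    pub fn spawn_child(&self, inheritance_factor: f64) -> Sentinel {
        Sentinel {
            score: self.score * inheritance_factor,
            decay: self.decay,
            auto_recover_after_turns: self.auto_recover_after_turns,
            critical_streak: 0,
        }
    }
}

/// Gate-side override: at Critical, an Allow becomes a Deny.
pub fn apply_risk_override(decision: Decision, level: RiskLevel) -> Decision {
    match (decision, level) {
        (Decision::Allow, RiskLevel::Critical) => Decision::Deny,
        (d, _) => d,
    }
}
```

In this sketch a caller would invoke `record` as signals arrive, call `advance_turn` at the start of each turn, and pass the resulting level to `apply_risk_override` before dispatching a tool call. Keeping the override in a free function mirrors the decoupling the commit message describes, where the gate reads a published risk value rather than holding the sentinel itself.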
a02ebe6 to 0373a88
Summary
- TrajectorySentinel (`zeph-core`): multi-turn heuristic risk accumulator with multiplicative decay, 4 risk levels (Normal/Elevated/High/Critical), hard-reset auto-recover, and subagent score inheritance via `spawn_child()`. Wired into the agent loop before gate evaluation; Critical state forces `PolicyGateExecutor` to deny all tool calls.
- ScopedToolExecutor (`zeph-tools`): config-driven per-task-type tool allow-lists with mandatory namespace prefixes (`builtin:`, `skill:`, `mcp:`, `acp:`, `a2a:`), build-time glob resolution, and an audit trail with `scope_at_definition` / `scope_at_dispatch` fields.
- Spec 050 (`specs/050-security-capability-governance/spec.md`): unified research synthesis of Aethelgard (#3563: RL-learned dynamic capability governance, minimum viable tool set per task type), CapSeal/SUDP (#3569: capability-sealed vault-broker pattern where agents propose operations without receiving secrets; Phase 3 design only), and SafeAgent (#3570: runtime governed tool mediation with context-aware decision core over session trajectory). Amends `specs/010-security/spec.md` with an architectural decision carving out advisory `TrajectorySentinel` accumulation from the cross-turn prohibition.

Closes
Test plan
- `cargo nextest run --workspace --lib --bins`: 8727 tests pass
- `cargo +nightly fmt --check`: clean
- `cargo clippy --workspace --lib --bins -- -D warnings`: clean
- `cargo run --features full -- --config .local/config/testing.toml`: see `.local/testing/playbooks/security-capability-governance.md` for 8 test scenarios
- `/trajectory status` shows Normal at startup
- `/scope list` shows configured scopes
- `RiskLevel` does not appear in LLM-visible tool error messages
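For reviewers exercising the `/scope list` check, the allow-list resolution described in the summary might look roughly like the sketch below. It is an illustration under stated assumptions, not the zeph-tools code: the `Scope` type, `glob_match` (which handles only a single trailing `*`), the tool IDs, and the reading of "strict" as "a `builtin:`/`skill:` pattern that matches nothing is an error" are all hypothetical. Only the five namespace prefixes and the resolve-to-`HashSet` idea come from the PR text.

```rust
use std::collections::HashSet;

/// Namespace prefixes required on every pattern (from the PR description).
const NAMESPACES: [&str; 5] = ["builtin:", "skill:", "mcp:", "acp:", "a2a:"];

/// Simplified glob for this sketch: a single trailing `*` matches any suffix.
fn glob_match(pattern: &str, id: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => id.starts_with(prefix),
        None => pattern == id,
    }
}

fn has_namespace(s: &str) -> bool {
    NAMESPACES.iter().any(|ns| s.starts_with(ns))
}

/// Resolved allow-list for one task type (hypothetical type).
pub struct Scope {
    allowed: HashSet<String>,
}

impl Scope {
    /// Expands patterns against the currently registered tool IDs.
    /// Assumption: `builtin:`/`skill:` patterns must match at least one tool
    /// ("strict"), while `mcp:`/`acp:`/`a2a:` patterns may match nothing yet
    /// ("provisional") and are picked up by calling `resolve` again after
    /// a dynamic registration.
    pub fn resolve(patterns: &[&str], registry: &[&str]) -> Result<Scope, String> {
        let mut allowed = HashSet::new();
        for &pattern in patterns {
            if !has_namespace(pattern) {
                return Err(format!("pattern `{pattern}` lacks a namespace prefix"));
            }
            let matches: Vec<&str> = registry
                .iter()
                .copied()
                .filter(|id| glob_match(pattern, id))
                .collect();
            let strict = pattern.starts_with("builtin:") || pattern.starts_with("skill:");
            if strict && matches.is_empty() {
                return Err(format!("pattern `{pattern}` matched no registered tool"));
            }
            allowed.extend(matches.into_iter().map(String::from));
        }
        Ok(Scope { allowed })
    }

    /// Dispatch-time check: only resolved tool IDs are permitted.
    pub fn permits(&self, tool_id: &str) -> bool {
        self.allowed.contains(tool_id)
    }
}
```

With a registry of `builtin:read_file`, `builtin:write_file`, and `skill:summarize` (made-up IDs), resolving `["builtin:read_*", "mcp:github/*"]` would permit only `builtin:read_file`. Re-resolving after an `mcp:github/...` tool registers would then admit it, which is the provisional behavior this sketch assumes.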