Add Guardian Subagent #3695

Open
taibai233 wants to merge 12 commits into esengine:main-v2 from taibai233:feature/guardian-subagent

Conversation

@taibai233 taibai233 commented Jun 9, 2026

Contributor

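Summary

Adds Guardian, an automated safety review layer that uses an LLM in place of human approval. When the permission policy returns Ask, the Guardian subagent evaluates the safety of the operation and returns allow/deny without interrupting the user.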

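Details

New files

• internal/guardian/guardian.go: GuardianSession core. Reuses the session across turns (so the prefix cache keeps hitting), delta transcript mode, circuit breaker (3 consecutive denials, or 10 denials among the most recent 50 reviews → interrupt the turn), fail-closed error handling, automatic compaction every 50 reviews
• internal/guardian/transcript.go: extracts a compact transcript from the Agent Session. User anchors plus a two-tier token budget (message 10k / tool 10k), cursor-tracked deltas, rune-safe truncation
• internal/guardian/policy.go: structured output parsing (direct JSON, plus tolerance for JSON wrapped in prose), enforcePolicyRules (critical → forced deny, high + low_auth → forced deny), policy file embedded via go:embed
• internal/guardian/guardian_policy.md: Guardian policy prompt covering the role isolation statement, risk level definitions, user authorization scoring, and outcome rules
• internal/guardian/guardian_test.go: 32 unit tests covering parsing, transcripts, the circuit breaker, and policy enforcement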
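The circuit breaker thresholds listed above (3 consecutive denials, or 10 denials within the most recent 50 reviews) can be illustrated with a small sketch. This is not the code from guardian.go; the `breaker` type, its fields, and the `record`/`reset` methods are hypothetical names chosen for illustration, and the sketch only reports whether a threshold has been crossed, without deciding what the caller does about it.

```go
package main

import "fmt"

// Thresholds taken from the PR description.
const (
	maxConsecutive = 3  // denials in a row
	windowSize     = 50 // number of recent reviews considered
	maxInWindow    = 10 // denials allowed inside the window
)

// breaker is a hypothetical stand-in for the Guardian circuit breaker state.
type breaker struct {
	consecutive int    // current run of consecutive denials
	recent      []bool // sliding window of outcomes, true = deny
}

// record registers one review outcome and reports whether either
// threshold has been reached.
func (b *breaker) record(denied bool) bool {
	if denied {
		b.consecutive++
	} else {
		b.consecutive = 0
	}
	b.recent = append(b.recent, denied)
	if len(b.recent) > windowSize {
		b.recent = b.recent[len(b.recent)-windowSize:]
	}
	denials := 0
	for _, d := range b.recent {
		if d {
			denials++
		}
	}
	return b.consecutive >= maxConsecutive || denials >= maxInWindow
}

// reset clears all state, as a per-turn reset would.
func (b *breaker) reset() { *b = breaker{} }

func main() {
	b := &breaker{}
	fmt.Println(b.record(true), b.record(true), b.record(true)) // false false true
}
```

Keeping the window as a plain slice is enough at this size; a ring buffer would avoid re-slicing but adds little for 50 entries.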

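Key design

• Transcript isolation: the transcript and the review request are split into separate user messages, so the API-level message boundary separates evidence from judgment and keeps the Guardian model from mistaking the agent's conversation for its own
• Cross-turn reuse: the Guardian Session holds its own Agent instance; an unchanged system prompt plus appended deltas keeps the prefix cache hitting
• Structured feedback: on deny it returns guardian denied: risk=high, auth=low. This operation would delete... instead of the generic "user declined" text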
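As a rough illustration of the parsing tolerance and the code-level policy backstop described in this PR, the sketch below parses a verdict that may be bare JSON or JSON wrapped in prose, then applies the two forced-deny rules. The field names `risk_level` and `user_authorization` appear in this thread; the `outcome` field name, the `assessment` type, and the functions `parseAssessment`, `enforce`, and `decide` are assumptions made for the example, not the contents of policy.go.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// assessment is a hypothetical shape for the Guardian verdict.
type assessment struct {
	RiskLevel         string `json:"risk_level"`
	UserAuthorization string `json:"user_authorization"`
	Outcome           string `json:"outcome"`
}

// parseAssessment tries direct JSON first, then falls back to the span
// between the first '{' and the last '}' to tolerate surrounding prose.
func parseAssessment(raw string) (assessment, error) {
	var direct assessment
	if err := json.Unmarshal([]byte(strings.TrimSpace(raw)), &direct); err == nil {
		return direct, nil
	}
	start := strings.Index(raw, "{")
	end := strings.LastIndex(raw, "}")
	if start < 0 || end <= start {
		return assessment{}, errors.New("no JSON object in guardian output")
	}
	var wrapped assessment
	if err := json.Unmarshal([]byte(raw[start:end+1]), &wrapped); err != nil {
		return assessment{}, err
	}
	return wrapped, nil
}

// enforce overrides the model's outcome for the two rules named in the PR:
// critical risk, or high risk combined with low user authorization.
func enforce(a assessment) assessment {
	if a.RiskLevel == "critical" || (a.RiskLevel == "high" && a.UserAuthorization == "low") {
		a.Outcome = "deny"
	}
	return a
}

// decide fails closed: parse errors and unknown outcomes both become deny.
func decide(raw string) string {
	a, err := parseAssessment(raw)
	if err != nil {
		return "deny"
	}
	if enforce(a).Outcome != "allow" {
		return "deny"
	}
	return "allow"
}

func main() {
	fmt.Println(decide(`Verdict: {"risk_level":"high","user_authorization":"low","outcome":"allow"}`)) // deny
	fmt.Println(decide(`{"risk_level":"low","user_authorization":"high","outcome":"allow"}`))          // allow
	fmt.Println(decide("no verdict here"))                                                             // deny
}
```

Applying the rules in code after parsing means a model that answers "allow" for a critical action is still overridden, which is the point of having a backstop outside the prompt.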

Configuration

│ [agent]
│ guardian_model = "deepseek-flash"
│ guardian_temperature = 0.0

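If these are not set, Guardian is disabled and approval falls back to the human prompt (backward compatible). Guardian uses its own provider and a read-only tool registry, and is built with clientId: 0 so that it does not recursively trigger approval on itself.

Changed files

• config.go: +GuardianModel, +GuardianTemperature
• event.go: +GuardianAssessment Kind and the GuardianResult struct (including Usage/Pricing)
• controller.go: gateApprover routes to Guardian; ResetTurn clears the circuit breaker
• permission.go: Gate.lastDenyReason + SetDenyReason pass along the Guardian denial reason
• boot.go: constructs the GuardianSession, injects pricing, and emits a notice at startup
• chat_tui.go: Guardian audit line, plus a unified FormatUsageLine for displaying tokens/cost
• serve/wire.go: SSE event mapping
• acp/dispatch.go: ACP message block

Tests

All 32 unit tests pass, the race detector is clean, and the full build compiles without errors.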

Closes #3582 Closes #3558


I don't know how this behaves on Windows or on the Mac Desktop app. Could someone with some spare time check it for me?

taibai added 2 commits June 9, 2026 16:52
Guardian replaces interactive human approval for 'ask' permission decisions
with an automated sub-agent backed by a dedicated safety policy.

- internal/guardian/guardian.go: GuardianSession with cross-turn session
  reuse, delta transcript mode for cache-hit, circuit breaker
  (3 consecutive / 10 recent denials -> interrupt), fail-closed errors,
  structured GuardianResult event emission
- internal/guardian/transcript.go: compact transcript extraction from
  agent session with user-anchored dual-budget selection, cursor-based
  delta tracking for DeepSeek prefix cache
- internal/guardian/policy.go: embedded guardian_policy.md, JSON
  assessment parsing (direct + prose-wrapped), structured deny reasons
- internal/guardian/guardian_policy.md: safety review policy defining
  risk taxonomy, user authorization model, and outcome rules
- internal/guardian/guardian_test.go: 28 unit tests covering parser,
  transcript, circuit breaker, config hash, subject extraction

Configuration: set [agent] guardian_model in reasonix.toml to enable.
Guardian runs on a separate model with read-only tools for context-aware
decisions; fallback to human approval when guardian_model is empty.
- Split transcript and action into separate user messages (API-level
  message boundary isolates evidence from judgment, preventing the
  guardian model from treating agent conversation as its own dialogue)
- enforcePolicyRules: critical->force deny, high+low_auth->force deny
- Gate.lastDenyReason + SetDenyReason: guardian denial reasons now
  reach the agent model (was dropped as generic 'user declined')
- GuardianAssessment event handled in chat_tui, serve/wire, acp/dispatch
- GuardianResult carries Usage+Pricing, rendered via FormatUsageLine
  for consistent display with main agent cost lines
- Guardian agent sink (newSink) silently captures Usage for cost
  reporting, never overwrites main agent TUI statistics
- NewSession takes pricing parameter for ¥ display
- Compact guardian session every 50 reviews with ContextWindow=100k
- boot.go emits enable/error notices visible in TUI startup
- 32 unit tests green, race detector clean
@github-actions github-actions Bot added v2 Go rewrite (1.x) — main-v2 branch, active development tui Terminal UI / CLI (internal/cli, internal/control) agent Core agent loop (internal/agent, internal/control) config Configuration & setup (internal/config) labels Jun 9, 2026
taibai added 4 commits June 9, 2026 19:41
- guardian: add Save/Load/Reset for prefix-cache persistence across restarts
- guardian: PathFor derives .guardian.jsonl from main session path
- controller: track guardianPath, auto-persist on Snapshot, auto-load on Resume/SwitchBranch
- controller: reset guardian on NewSession/ClearSession/Fork/Branch
- controller: remove guardian file on ClearSession
- boot: add resolvePricing to fall back to built-in defaults when user config omits price
- cli: color guardian outcome — green allow, red deny
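The commit notes say PathFor derives a .guardian.jsonl path from the main session path. One plausible way to do that is sketched below; the exact rule used in the PR is not shown in this thread, so `guardianPathFor` is a hypothetical helper and the "replace the extension" behavior is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// guardianPathFor is a hypothetical helper: it drops the session file's
// extension (if any) and appends ".guardian.jsonl" so the guardian
// transcript sits next to the main session file.
func guardianPathFor(sessionPath string) string {
	base := strings.TrimSuffix(sessionPath, filepath.Ext(sessionPath))
	return base + ".guardian.jsonl"
}

func main() {
	fmt.Println(guardianPathFor("sessions/abc.jsonl")) // sessions/abc.guardian.jsonl
}
```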

@esengine esengine left a comment

Owner

Reviewed in depth. The pattern is acceptable — a config-gated system hook, not a model-invocable spawn tool; bounded, fail-closed, serialized — and the craft is genuinely good (delta transcripts for cache warmth, code-level policy backstops, 32 real invariant tests). Blocking asks:

  1. Match #3558's design: on dangerous/uncertain, escalate to the human prompt with the guardian's risk annotation attached, instead of auto-denying. As shipped, guardian-on means the user can never approve a legitimately wanted high-risk action short of typing a new message.
  2. resolvePricing in boot.go changes pricing fallback globally for executor/planner/task/skills — separate PR; matching by model name can mis-price a third-party provider serving a same-named model.
  3. Gate.SetDenyReason is an unsynchronized side channel — concurrent Asks can cross-talk. Thread the reason through the Approve return path.
  4. Wire OutputSchema() into the request or delete it; and call NeedsRebuild before Load on resume/branch-switch — right now a stale guardian session loads unchecked, which is exactly the staleness the helper warns about.
  5. Trivial rebase (event.go kindNames / wire.go collided with MCPSurfaceReady); keep GuardianAssessment last in the Kind enum. Rename ParsOutcome to ParseOutcome.
  6. Description nit: the circuit breaker doesn't interrupt the turn — cbInterrupt only changes the deny text. Either make it interrupt or soften the claim.
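Item 3 above asks for the denial reason to travel on the Approve return path rather than through a shared setter. A minimal sketch of that shape follows, with the caveat that `decision`, `approver`, `guardianApprove`, and `askAll` are hypothetical names and do not reflect the actual Gate or Approve signatures in this repository.

```go
package main

import (
	"fmt"
	"sync"
)

// decision carries the verdict and its reason together, so there is no
// shared "last deny reason" field for concurrent asks to overwrite.
type decision struct {
	Allowed bool
	Reason  string
}

// approver is a hypothetical approval function type.
type approver func(action string) decision

// guardianApprove is a stand-in approver that always denies with a
// reason specific to the action it was asked about.
func guardianApprove(action string) decision {
	return decision{Allowed: false, Reason: "guardian denied: " + action}
}

// askAll runs approvals concurrently. Each goroutine writes only its own
// slice element, so every caller sees the reason for its own request.
func askAll(approve approver, actions []string) []decision {
	out := make([]decision, len(actions))
	var wg sync.WaitGroup
	for i, a := range actions {
		wg.Add(1)
		go func(i int, a string) {
			defer wg.Done()
			out[i] = approve(a)
		}(i, a)
	}
	wg.Wait()
	return out
}

func main() {
	for _, d := range askAll(guardianApprove, []string{"rm -rf build", "git push --force"}) {
		fmt.Println(d.Allowed, d.Reason)
	}
}
```

Because the reason is part of the returned value, no lock is needed to keep two concurrent Asks from reading each other's text.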

taibai added 4 commits June 11, 2026 20:56
…ubagent

# Conflicts:
#	desktop/frontend/package.json
#	desktop/frontend/src/lib/types.ts
#	desktop/wire.go
#	desktop/wire_test.go
#	internal/boot/boot.go
#	internal/control/controller.go
#	internal/event/event.go
#	internal/serve/wire.go
@github-actions github-actions Bot added the desktop Wails desktop app (desktop/**) label Jun 12, 2026
@taibai233

Contributor Author

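This update has been pushed to feature/guardian-subagent; the current head is 9cf4e86c.

The changes fall into a few groups:

  1. Synced the latest origin/main-v2 into the Guardian branch. The base is now f48a2411 and the conflicts with upstream are resolved. Upstream logic such as Steer events, the memory remember/forget approval display, and resume cold-prune has been preserved.

  2. Addressed the Guardian review feedback:

    • risk_level / user_authorization are now trimmed, lowercased, and checked against a whitelist; unknown enum values fail closed.
    • Guardian session Load now validates the current policy/system prompt and resets on mismatch, so an old policy cannot stay in effect.
    • Before a branch switch / resume, Guardian is reset and then tries to load the matching .guardian.jsonl, so stale context is not reused across branches.
    • A Guardian deny now continues to the human approval prompt, with the risk explanation passed through as the approval reason.
  3. Synced the Desktop wire/frontend contract:

    • The Desktop wire now includes the approval.reason and guardian_assessment payloads.
    • The TS types now cover guardian_assessment, mcp_surface_ready, and steer.
    • The Desktop approval modal shows the Guardian reason, and the Guardian assessment also enters the transcript as a notice.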
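The trim + lowercase + whitelist handling mentioned under item 2 can be sketched as follows. Only "critical", "high", and a low authorization level are named in this thread; the remaining enum members in the two sets, and the `normalizeEnum` helper itself, are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Allowed values. "critical", "high", and authorization "low" come from
// the PR text; the other members are assumed for this sketch.
var (
	riskLevels = map[string]bool{"low": true, "medium": true, "high": true, "critical": true}
	authLevels = map[string]bool{"low": true, "medium": true, "high": true}
)

// normalizeEnum trims and lowercases v, then checks it against allowed.
// The boolean is false for unknown values, so the caller can fail closed.
func normalizeEnum(v string, allowed map[string]bool) (string, bool) {
	n := strings.ToLower(strings.TrimSpace(v))
	if !allowed[n] {
		return "", false
	}
	return n, true
}

func main() {
	fmt.Println(normalizeEnum("  HIGH ", riskLevels)) // high true
	fmt.Println(normalizeEnum("severe", riskLevels))  //  false
	fmt.Println(normalizeEnum("Low", authLevels))     // low true
}
```

A caller would treat a false result as a deny (or an escalation to the human prompt) rather than guessing the nearest valid value.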

Local verification:

  • go test ./internal/guardian ./internal/control ./internal/serve ./internal/acp -run ...
  • go test ./internal/... -run '^$'
  • make build

All of the above pass. On the Desktop side this update only syncs the wire/frontend contract; a local Desktop build has not yet been verified.

…ubagent

# Conflicts:
#	desktop/frontend/package.json
#	desktop/frontend/src/lib/useController.ts
#	internal/bot/render.go
#	internal/control/controller.go
#	internal/control/controller_test.go
…ubagent

# Conflicts:
#	desktop/frontend/package.json
Labels

agent Core agent loop (internal/agent, internal/control) config Configuration & setup (internal/config) desktop Wails desktop app (desktop/**) tui Terminal UI / CLI (internal/cli, internal/control) v2 Go rewrite (1.x) — main-v2 branch, active development

Development

Successfully merging this pull request may close these issues.

[Feature]: The prompt-style allow permission design is effectively outdated
[Feature]: Request to add an agent guardian

2 participants