Fix ClickClack toolsAllow reply dispatch #90405

Merged
steipete merged 4 commits into main from clickclack on Jun 5, 2026
Conversation

steipete commented Jun 4, 2026

Contributor

Summary

  • Propagates ClickClack account toolsAllow into inbound reply dispatch so account-level tool policy reaches agent/model reply paths.
  • Threads runtime toolsAllow through shared auto-reply dispatch, provider dispatcher wrappers, embedded agent runs, reply_dispatch hook events, and ACP hook handoff.
  • Fails closed for ACP-bound sessions when a restrictive runtime toolsAllow is present, because ACPX 0.10.0 supports allowed_tools only as a session option and cannot guarantee per-turn enforcement on reused persistent sessions.
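
A minimal sketch of the propagation described above, assuming hypothetical type and function names (AccountConfig, ReplyDispatchParams, AgentRunOptions, buildDispatchParams, toAgentRunOptions, resolveTools); none of them are taken from the OpenClaw source. It only illustrates the idea that the account-level allowlist is copied into the dispatch parameters once and then forwarded unchanged until the agent run filters its tools.

```typescript
// Hypothetical shapes for illustration only; not the real OpenClaw types.
type ToolsAllow = readonly string[];

interface AccountConfig {
  id: string;
  toolsAllow?: ToolsAllow; // undefined means "no account-level restriction"
}

interface ReplyDispatchParams {
  accountId: string;
  text: string;
  toolsAllow?: ToolsAllow;
}

interface AgentRunOptions {
  prompt: string;
  toolsAllow?: ToolsAllow;
}

// Inbound handler: copy the account policy into the dispatch params.
function buildDispatchParams(account: AccountConfig, text: string): ReplyDispatchParams {
  return {
    accountId: account.id,
    text,
    ...(account.toolsAllow ? { toolsAllow: [...account.toolsAllow] } : {}),
  };
}

// Dispatch layer: forward the allowlist unchanged to the agent run.
function toAgentRunOptions(params: ReplyDispatchParams): AgentRunOptions {
  return {
    prompt: params.text,
    ...(params.toolsAllow ? { toolsAllow: params.toolsAllow } : {}),
  };
}

// Agent layer: keep only the tools named in the allowlist, if one is set.
function resolveTools(available: readonly string[], options: AgentRunOptions): string[] {
  if (options.toolsAllow === undefined) return [...available];
  const allowed = new Set(options.toolsAllow);
  return available.filter((tool) => allowed.has(tool));
}
```

In this sketch an empty allowlist is still forwarded (an empty array is truthy), so it results in no tools rather than silently falling back to all tools.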

Builds on and supersedes the direct ClickClack fix in #89500.

Verification

  • Live ClickClack E2E on Crabbox AWS: run_6a0472ed7e71, provider aws, id cbx_dace25addcaa.
  • Live proof covered Docker ClickClack server install/bootstrap, OpenClaw config with real OpenAI auth, outbound channel send, outbound thread reply, inbound channel model reply containing CLACK_OK, inbound thread model reply containing CLACK_OK, outbound DM send, and inbound DM model reply containing CLACK_OK.
  • WebVNC visual check opened the ClickClack app and verified authenticated UI state with #general, direct messages, bot DM, bot/human messages, and composer visible.
  • Focused Vitest: node scripts/run-vitest.mjs run src/auto-reply/reply/dispatch-acp.test.ts src/plugin-sdk/acp-runtime.test.ts src/auto-reply/reply/dispatch-from-config.reply-dispatch.test.ts src/auto-reply/dispatch.test.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/provider-dispatcher.test.ts extensions/clickclack/src/inbound.test.ts --reporter=verbose
  • Remote changed gate: node scripts/crabbox-wrapper.mjs run --provider aws --target linux -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed, run run_d32af37fb265, provider aws, id cbx_8236876017c9.
  • Autoreview: .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main, clean with no accepted/actionable findings.

Notes

  • Checked sibling ../codex; no ACP runtime protocol/source path there exposes toolsAllow.
  • Checked node_modules/acpx/dist/runtime.d.ts and node_modules/acpx/dist/session-options-*.d.ts; ACPX turn input has no per-turn allowlist, while allowed_tools is a session option.
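
To illustrate why a session-level option cannot enforce a per-turn policy on a reused session, here is a toy model. ToySession, ToySessionPool, SessionOptions, and TurnInput are hypothetical and do not mirror the ACPX API; the only property carried over from the notes above is that the allowlist is fixed when the session is created and the turn input has no allowlist field.

```typescript
// Toy model only; these names are assumptions, not ACPX types.
interface SessionOptions {
  allowedTools?: readonly string[];
}

interface TurnInput {
  text: string; // deliberately no allowlist field
}

class ToySession {
  readonly options: SessionOptions;

  constructor(options: SessionOptions) {
    this.options = options;
  }

  // The effective tool set depends only on the options captured at creation.
  runTurn(input: TurnInput): { text: string; effectiveTools: readonly string[] | "all" } {
    return { text: input.text, effectiveTools: this.options.allowedTools ?? "all" };
  }
}

class ToySessionPool {
  private readonly sessions = new Map<string, ToySession>();

  // A persistent session is reused by key; options passed later are ignored.
  getOrCreate(key: string, options: SessionOptions): ToySession {
    const existing = this.sessions.get(key);
    if (existing) return existing;
    const created = new ToySession(options);
    this.sessions.set(key, created);
    return created;
  }
}
```

In this model, a session first created without restrictions keeps running turns with all tools even when a later caller asks for a narrower allowlist, which is the situation the fail-closed behavior is meant to avoid.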

openclaw-barnacle Bot added the size: M and maintainer (Maintainer-authored PR) labels Jun 4, 2026
clawsweeper Bot commented Jun 4, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 4, 2026, 12:18 PM ET / 16:18 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +48, Tests +205. Total +253 across 19 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.
Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against c0b3c8cdb9dc.

Label changes

  • add rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
Evidence reviewed

PR surface:

Source +48, Tests +205. Total +253 across 19 files.

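| Area      | Files | Added | Removed | Net  |
|-----------|-------|-------|---------|------|
| Source    | 12    | 51    | 3       | +48  |
| Tests     | 7     | 205   | 0       | +205 |
| Docs      | 0     | 0     | 0       | 0    |
| Config    | 0     | 0     | 0       | 0    |
| Generated | 0     | 0     | 0       | 0    |
| Other     | 0     | 0     | 0       | 0    |
| Total     | 19    | 256   | 3       | +253 |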

What I checked:

  • failure reason: timeout.
  • codex failure detail: Codex review failed for this PR: spawnSync codex ETIMEDOUT.
  • codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added the rating: 🌊 off-meta tidepool label (PR readiness rating does not apply to this item) Jun 4, 2026

steipete commented Jun 5, 2026

Contributor Author

Land-ready proof for #90405.

Work done:

  • Propagated ClickClack account toolsAllow into inbound reply dispatch.
  • Threaded runtime toolsAllow through shared reply dispatch, provider dispatch wrappers, embedded agent runs, reply_dispatch events, and ACP hook handoff.
  • Added ACP fail-closed behavior for restrictive runtime toolsAllow, because ACPX 0.10.0 exposes allowed_tools as a session option and cannot guarantee per-turn enforcement for reused ACP sessions.
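
The fail-closed rule in the last item can be sketched as a small decision function. This is an illustration under stated assumptions: DispatchContext, DispatchDecision, isRestrictive, and decideDispatch are hypothetical names, and treating a "*" entry as unrestricted is an assumption of this sketch, not something the PR states.

```typescript
// Hypothetical names for illustration; not the actual OpenClaw dispatch code.
type ToolsAllow = readonly string[];

interface DispatchContext {
  acpBound: boolean; // assumed flag: the reply would run on an ACP-bound session
  toolsAllow?: ToolsAllow; // runtime allowlist threaded through dispatch
}

type DispatchDecision =
  | { kind: "run-acp" }
  | { kind: "run-embedded"; toolsAllow?: ToolsAllow }
  | { kind: "blocked"; reason: string };

// Assumption: an absent allowlist or one containing "*" is unrestricted.
function isRestrictive(toolsAllow: ToolsAllow | undefined): boolean {
  return toolsAllow !== undefined && !toolsAllow.includes("*");
}

function decideDispatch(ctx: DispatchContext): DispatchDecision {
  if (!ctx.acpBound) {
    // Embedded runs can apply the allowlist on every turn.
    return ctx.toolsAllow === undefined
      ? { kind: "run-embedded" }
      : { kind: "run-embedded", toolsAllow: ctx.toolsAllow };
  }
  if (isRestrictive(ctx.toolsAllow)) {
    // Fail closed: a reused session may have been created with broader tools.
    return {
      kind: "blocked",
      reason: "restrictive toolsAllow cannot be enforced per turn on an ACP session",
    };
  }
  return { kind: "run-acp" };
}
```

Blocking rather than silently running is the conservative choice here: the alternative would let a turn execute with whatever tools the existing session already had.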

Commands/proof:

  • Live ClickClack E2E: Crabbox AWS run_6a0472ed7e71, provider aws, id cbx_dace25addcaa. Covered Docker ClickClack install/bootstrap, OpenClaw config with real OpenAI auth, outbound channel, outbound thread, inbound channel model reply, inbound thread model reply, outbound DM, and inbound DM model reply. Model replies included CLACK_OK.
  • WebVNC visual check: authenticated ClickClack app showed #general, direct messages, bot DM, bot/human messages, and composer.
  • Focused Vitest: node scripts/run-vitest.mjs run src/auto-reply/reply/dispatch-acp.test.ts src/plugin-sdk/acp-runtime.test.ts src/auto-reply/reply/dispatch-from-config.reply-dispatch.test.ts src/auto-reply/dispatch.test.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/provider-dispatcher.test.ts extensions/clickclack/src/inbound.test.ts --reporter=verbose
  • Remote changed gate: node scripts/crabbox-wrapper.mjs run --provider aws --target linux -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed, run run_d32af37fb265, provider aws, id cbx_8236876017c9.
  • Autoreview: .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main, clean with no accepted/actionable findings.
  • Dependency/Codex contract checks: inspected sibling ../codex; no ACP runtime protocol/source path exposes toolsAllow. Inspected node_modules/acpx/dist/runtime.d.ts and node_modules/acpx/dist/session-options-*.d.ts; ACPX turn input has no per-turn allowlist, while allowed_tools is a session option.

CI:

  • Required checks: none reported for clickclack.
  • Check rollup audit: no active failing check found; remaining non-success conclusions were skipped/neutral/cancelled superseded runs.

Known proof gaps:

  • Did not wait for every advisory CodeQL/OpenGrep job after required checks reported none; local + Crabbox static gate and live E2E covered the touched behavior.

steipete merged commit 797bcd5 into main on Jun 5, 2026
184 of 186 checks passed
steipete deleted the clickclack branch on June 5, 2026 15:40
Labels

  • maintainer: Maintainer-authored PR
  • rating: 🌊 off-meta tidepool: PR readiness rating does not apply to this item.
  • size: M

2 participants