Problem
Today's headless modes for the four CLI runtimes (claude --print, gemini --output-format stream-json, codex exec --json, opencode run --format json) all auto-run any tool the agent invokes. There is no permission interception, and an operator running RemoteClaw via chat has no way to approve or deny individual calls.
When a runtime is migrated to its richer permission-emitting mode (Codex app-server, OpenCode serve/acp, Gemini --acp, Claude --input-format stream-json with control_request), middleware needs to:
Capture the permission request from the CLI subprocess
Surface it as a chat message with platform-native UI (inline keyboard / Block Kit / ActionRow / quick replies)
Wait for user response (seconds to minutes)
Route the decision back to the subprocess via its native API
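The four steps above can be sketched as a small in-memory registry: the runtime adapter registers a pending request, the channel adapter later resolves it, and the returned promise carries the decision back toward the subprocess. Everything here (`PendingApprovals`, `request`, `resolve`, the timeout parameter) is a hypothetical illustration rather than RemoteClaw's actual API, and nothing is persisted, which matches the Persistence section of this issue.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not RemoteClaw's real API.
type Decision = { approvalId: string; nativeDecision: string; decidedBy: string };

type Pending = {
  settle: (d: Decision) => void;
  timer: ReturnType<typeof setTimeout>;
};

class PendingApprovals {
  private readonly pending = new Map<string, Pending>();

  // Steps 1-3: called once the permission request has been captured and surfaced.
  // The promise stays open while the user decides (seconds to minutes).
  // The timeout is an assumption; the issue does not specify one.
  request(approvalId: string, timeoutMs: number): Promise<Decision> {
    return new Promise<Decision>((settle, fail) => {
      const timer = setTimeout(() => {
        this.pending.delete(approvalId);
        fail(new Error(`approval ${approvalId} timed out`));
      }, timeoutMs);
      this.pending.set(approvalId, { settle, timer });
    });
  }

  // Step 4: called by the channel adapter. Returns false for unknown or stale ids.
  resolve(decision: Decision): boolean {
    const entry = this.pending.get(decision.approvalId);
    if (!entry) return false;
    clearTimeout(entry.timer);
    this.pending.delete(decision.approvalId);
    entry.settle(decision);
    return true;
  }

  size(): number {
    return this.pending.size;
  }
}
```

A runtime adapter would await `request(...)` and then write the decision to the subprocess through its native API; a second `resolve` for the same id is rejected, which guards against double-clicked buttons.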
Acceptance criteria
Schema (in src/middleware/types.ts)
New AgentApprovalRequestEvent discriminant on the AgentEvent union:
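A minimal sketch of how the discriminant could look. The names `approvalId`, `toolName`, `argsHash`, `displayArgs`, `options[].nativeDecision`, `approvalSource`, `resolvedApprovals`, and `decidedBy` appear elsewhere in this issue; the `type` tag values, the `"runtime"` source value, the `label` field, and the placeholder `AgentTextEvent` member are assumptions added for illustration.

```typescript
// Sketch only: the real definitions in src/middleware/types.ts may differ.
interface ApprovalOption {
  label: string;          // assumed: text shown on the button
  nativeDecision: string; // backend vocabulary, passed through verbatim
}

interface AgentApprovalRequestEvent {
  type: "approval_request"; // assumed discriminant value
  approvalId: string;
  // "runtime" is an assumed name; the other two are reserved, not implemented.
  approvalSource: "runtime" | "gateway_tool" | "mcp_elicitation";
  toolName: string;
  argsHash: string;
  displayArgs: Record<string, unknown>; // pre-redacted, safe to show in chat
  options: ApprovalOption[];
}

// Placeholder standing in for the existing members of the union.
interface AgentTextEvent {
  type: "text";
  text: string;
}

type AgentEvent = AgentTextEvent | AgentApprovalRequestEvent;

// Resolution channel: decisions travel back in on the execute params.
interface ResolvedApproval {
  approvalId: string;
  nativeDecision: string;
  decidedBy: string;
}

interface AgentExecuteParams {
  resolvedApprovals?: ResolvedApproval[]; // other fields omitted
}

function isApprovalRequest(e: AgentEvent): e is AgentApprovalRequestEvent {
  return e.type === "approval_request";
}
```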
Each runtime adapter (claude.ts / gemini.ts / codex.ts / opencode.ts) translates its native permission-request shape into AgentApprovalRequestEvent and consumes resolvedApprovals to send the response back via the native API.
Backend-native decision vocabulary is preserved verbatim in options[].nativeDecision: Codex has 6 decisions (including AcceptForSession / AcceptWithExecpolicyAmendment / ApplyNetworkPolicyAmendment), OpenCode has 3 (once / always / reject), ACP has 4 (allow_once / allow_always / reject_once / reject_always), and Claude has 2 (allow / deny). No flattening to a lowest common denominator.
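One way to keep each backend's vocabulary intact is a per-backend table whose entries pass through unchanged as `nativeDecision`. The decision strings for OpenCode, ACP, and Claude are the ones listed above; the `label` text, the `grantsAlways` flag, and the function name are assumptions. Codex is left out of this sketch because only three of its six decisions are named in this issue.

```typescript
type Backend = "opencode" | "acp" | "claude";

interface ApprovalOption {
  label: string;          // assumed display text
  nativeDecision: string; // verbatim backend value, never normalized
  grantsAlways: boolean;  // assumed flag marking allow-always style decisions
}

const NATIVE_DECISIONS: Record<Backend, readonly string[]> = {
  opencode: ["once", "always", "reject"],
  acp: ["allow_once", "allow_always", "reject_once", "reject_always"],
  claude: ["allow", "deny"],
};

// Decisions that outlive a single call and therefore need the scoping
// described under "Security must-includes".
const ALWAYS_DECISIONS = new Set<string>(["always", "allow_always"]);

function optionsFor(backend: Backend): ApprovalOption[] {
  return NATIVE_DECISIONS[backend].map((d) => ({
    label: d.replace(/_/g, " "),
    nativeDecision: d,
    grantsAlways: ALWAYS_DECISIONS.has(d),
  }));
}
```

Because `nativeDecision` is never rewritten, the adapter can hand the chosen value straight back to its backend without a reverse mapping.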
Channel-adapter contract (in extensions/{telegram,slack,discord,whatsapp,...}/)
New channel-adapter interface method to render an AgentApprovalRequestEvent as a platform-native interactive message with options[] mapped to buttons / quick-replies
Correlation: approvalId carried in callback_data / action_id / custom_id / button id
Channel-adapter validates decidedBy ∈ ChannelMessage.authorizedSenders BEFORE forwarding decision back into AgentExecuteParams.resolvedApprovals
Plain-text fallback for channels without interactive components (SMS / iMessage / Signal): /approve {id} and /deny {id} patterns
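The correlation, authorization, and plain-text fallback items above could be covered by a few small helpers. This is a sketch under assumptions: the function names and the `apr:` payload prefix are hypothetical, and the payload is kept short because Telegram limits `callback_data` to 64 bytes.

```typescript
// Hypothetical helpers; names and the "apr:" prefix are assumptions.
interface ParsedDecision {
  approvalId: string;
  choice: string;
}

// Pack the correlation id and chosen option index into a button payload
// (callback_data / action_id / custom_id, depending on the platform).
// Assumes approvalId contains no ":" characters.
function encodeButton(approvalId: string, optionIndex: number): string {
  return `apr:${approvalId}:${optionIndex}`;
}

function decodeButton(payload: string): { approvalId: string; optionIndex: number } | null {
  const m = /^apr:([^:]+):(\d+)$/.exec(payload);
  return m ? { approvalId: m[1], optionIndex: Number(m[2]) } : null;
}

// Plain-text fallback for channels without interactive components:
// "/approve {id}" or "/deny {id}".
function parseTextDecision(text: string): ParsedDecision | null {
  const m = /^\/(approve|deny)\s+(\S+)$/.exec(text.trim());
  return m ? { choice: m[1], approvalId: m[2] } : null;
}

// Gate that runs before anything is forwarded into resolvedApprovals.
function isAuthorized(decidedBy: string, authorizedSenders: readonly string[]): boolean {
  return authorizedSenders.indexOf(decidedBy) !== -1;
}
```

A channel adapter would call `isAuthorized` first and drop the decision on failure, so an unauthorized tap on a button in a group chat never reaches the runtime.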
Security must-includes
Allow-always cardinality bound: any "always" decision (Codex AcceptForSession, OpenCode always, ACP allow_always) MUST be scoped to (toolName, argsHash), NEVER toolName alone — Cursor MCP exploit pattern prevention
Sensitive-data redaction at runtime layer (NOT channel adapter): runtime produces displayArgs separate from rawInput. Channel adapters receive pre-redacted display fields and never see raw secrets
Append-only audit log: every resolved approval logs {requestId, agentId, sessionKey, toolName, argsHash, decision, decidedBy, decidedAt, channel, decisionLatencyMs} to an append-only sink
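Two of the requirements above lend themselves to a short sketch: keying "always" grants on the pair (toolName, argsHash), and an audit sink that exposes append as its only mutation. The class names, the `requestedAt` parameter, and the in-memory array standing in for a real sink are assumptions; the audit fields are the ones listed above.

```typescript
// Hypothetical sketch; class names and the in-memory sink are assumptions.
class AlwaysGrants {
  private readonly granted = new Set<string>();

  // The key always includes argsHash, so a grant for one invocation
  // never extends to the same tool called with different arguments.
  private static key(toolName: string, argsHash: string): string {
    return JSON.stringify([toolName, argsHash]);
  }

  grant(toolName: string, argsHash: string): void {
    this.granted.add(AlwaysGrants.key(toolName, argsHash));
  }

  covers(toolName: string, argsHash: string): boolean {
    return this.granted.has(AlwaysGrants.key(toolName, argsHash));
  }
}

interface AuditEntry {
  requestId: string;
  agentId: string;
  sessionKey: string;
  toolName: string;
  argsHash: string;
  decision: string;
  decidedBy: string;
  decidedAt: number; // epoch milliseconds
  channel: string;
  decisionLatencyMs: number;
}

class AuditLog {
  private readonly entries: AuditEntry[] = [];

  // Append is the only mutation exposed; stored entries are frozen copies.
  append(entry: Omit<AuditEntry, "decisionLatencyMs">, requestedAt: number): AuditEntry {
    const full = Object.freeze({ ...entry, decisionLatencyMs: entry.decidedAt - requestedAt });
    this.entries.push(full);
    return full;
  }

  // Returns a copy so callers cannot modify the log through the result.
  all(): readonly AuditEntry[] {
    return this.entries.slice();
  }
}
```

Serializing the pair with `JSON.stringify` avoids key collisions that plain string concatenation could produce when a tool name contains the separator.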
Persistence
Persistence is fully on the CLI side. The CLI subprocess holds the pending approval in its own memory while blocked on the native API. No durable middleware approval store. If the subprocess dies mid-approval, the approval is lost; mitigation is operational (follow-up "agent restarted, please retry" channel message; no technical recovery).
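Since nothing is recovered after a crash, the middleware side reduces to dropping whatever was pending and notifying the affected chats. A minimal sketch, assuming a hypothetical in-memory map of pending approvals and a hypothetical `chatId` field; only the wording of the retry message comes from this issue.

```typescript
// Hypothetical shapes; only the retry message text comes from the issue.
interface PendingApproval {
  approvalId: string;
  chatId: string;
}

interface OutboundMessage {
  chatId: string;
  text: string;
}

// Intended to be called from the subprocess exit handler. Clears all pending
// approvals (they cannot be recovered) and returns one notice per affected chat.
function onSubprocessExit(pending: Map<string, PendingApproval>): OutboundMessage[] {
  const chats = new Set<string>();
  pending.forEach((p) => chats.add(p.chatId));
  pending.clear();
  return Array.from(chats).map((chatId) => ({
    chatId,
    text: "agent restarted, please retry",
  }));
}
```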
Tests
Unit tests per runtime adapter: native-shape ↔ AgentApprovalRequestEvent round-trip
Unit tests per channel adapter: AgentApprovalRequestEvent → platform message round-trip + decision parsing back
Integration test: end-to-end approval flow for at least one runtime + one channel pair (recommend Claude + Telegram as the smallest viable scope)
Non-goals
Long-lived subprocess implementation (separate issue: the supervisor split — this work depends on it)
Persistence of pending approvals across subprocess restart (CLI side, no recovery)
Future approvalSource values (gateway_tool, mcp_elicitation) — schema slot reserved but not implemented
Dependencies
Long-lived subprocess supervisor (separate issue): the subprocess must stay alive across the chat round-trip (seconds to minutes). Per-execute ephemeral spawn cannot span that window.
Each runtime needs to be in approval-emitting mode (Codex app-server, OpenCode serve/acp, Gemini --acp, Claude --input-format stream-json) before its adapter changes ship.
Per-runtime translation
New resolution channel on AgentExecuteParams: resolvedApprovals. Native permission surface for each runtime:
Claude: control_request → emit; control_response ← consume (requires --input-format stream-json mode)
Gemini: requestPermission (per-session JSON-RPC, bidirectional); requires --acp mode
Codex: item/commandExecution/requestApproval and item/fileChange/requestApproval; requires app-server mode
OpenCode: permission.asked SSE → emit; POST /permission/:id/reply ← consume; requires serve mode
Channel-adapter contract: platform request verification (X-Slack-Signature, Telegram secret_token, Discord Ed25519)
Effort
5-10 days. Spec-first; lands per-runtime incrementally.