fix(cron): exclude sandbox from shallow merge in isolated agent config #14556
seheepeak wants to merge 1 commit into openclaw:main
244e19e to f54da17
When a per-agent sandbox config (e.g. docker overrides) exists in agents.list[].sandbox, the Object.assign shallow merge in runCronIsolatedAgentTurn overwrites the entire agents.defaults.sandbox object, losing the mode, scope, and workspaceAccess fields. This causes resolveSandboxConfigForAgent to fall back to mode: "off", so cron isolated agents execute on the host instead of inside the sandbox container.

Fix: exclude sandbox from the destructured agent config override. Sandbox resolution is already handled separately by resolveSandboxConfigForAgent, which properly merges global defaults with per-agent overrides.

Closes openclaw#4171
f54da17 to 99517e4
bfc1ccb to f92900f
When a per-agent sandbox config contains only docker overrides (e.g. workdir, network, binds), the Object.assign shallow merge replaces the entire agents.defaults.sandbox, losing mode, scope, and workspaceAccess. This causes resolveSandboxConfigForAgent to fall back to mode: 'off', so cron isolated agents execute directly on the host instead of inside the sandbox.

Exclude sandbox from the destructured agent config override, since sandbox resolution is already handled separately by resolveSandboxConfigForAgent, which properly merges global defaults with per-agent overrides.

Fixes openclaw#4171
Related: openclaw#14556
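The failure mode described in this commit message can be reproduced in isolation. The following is a minimal sketch under simplified assumptions: the type names, the `model` field, and the `"rw"` value for `workspaceAccess` are illustrative placeholders, not taken from the openclaw source.

```typescript
// Hypothetical, simplified config shapes; the real openclaw types have more fields.
type SandboxConfig = {
  mode?: "off" | "all";
  scope?: string;
  workspaceAccess?: string;
  docker?: { workdir?: string; network?: string; binds?: string[] };
};
type AgentConfig = { model?: string; sandbox?: SandboxConfig };

// Global defaults: sandboxing is on for every agent.
const defaults: AgentConfig = {
  model: "default-model", // placeholder field, only here to show that other keys merge fine
  sandbox: { mode: "all", scope: "agent", workspaceAccess: "rw" }, // "rw" is a placeholder value
};

// Per-agent override that customizes docker settings only.
const agentOverride: AgentConfig = {
  sandbox: { docker: { network: "host" } },
};

// Object.assign copies top-level keys only, so the override's `sandbox`
// object replaces the default `sandbox` object wholesale.
const merged: AgentConfig = Object.assign({}, defaults, agentOverride);

console.log(merged.sandbox?.mode); // undefined: mode, scope, workspaceAccess are gone
console.log(merged.sandbox?.docker?.network); // "host"
```

Any resolver that treats a missing `mode` as `"off"` will then disable sandboxing for this agent, which matches the behavior reported above.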
We're running into this exact issue on our multi-agent setup (gateway in Docker, …).

Spent a few hours debugging this today. The symptoms are confusing because sandbox containers exist but stay stopped, and exec just falls back to in-gateway with no error. The only signal is that Docker service names resolve but …

This fix is correct. Would love to see this merged, and happy to help test if needed.
…openclaw#69) * fix: sandbox mount paths + DNS for gateway-in-Docker - Add per-agent sandbox binds using HOST_WORKSPACE_DIR (host paths) so sandbox containers can read workspace files - Add explicit DNS (8.8.8.8) to daemon and gateway containers to avoid host systemd-resolved failures * feat: enable heartbeat (15m), add message tool to sandbox, fix telegram bot token fallback - Set default heartbeat interval to 15m (was 0m/disabled) - Add 'message' to sandbox tool allow list so agents can send Telegram messages - Set top-level botToken fallback from first agent's token to prevent 'Telegram bot token missing' error when agents use message tool * revert: remove message tool and botToken fallback Keep scope to heartbeat enablement only. Task comms will go through the daemon task API instead of direct agent messaging. * feat: task API v1 — daemon endpoints + Convex backend Minimum viable task API for agent heartbeat integration: Convex: - Updated task schema: to_do/in_progress/review/done/blocked statuses, single assigneeId (required), reviewerIds array, per-reviewer status - Added tasks.ts: listByAgent, getWithMessages, pickup, addMessage - Added HTTP routes: /api/daemon/tasks, tasks/get, tasks/pickup, tasks/message - Added template update endpoint: /api/daemon/templates/update - Added seed:createTestTask for testing - Schema compat: added browserEnabled to harbors, reviewers to tasks Daemon: - New tasks.ts module: HTTP proxy from agent sandbox to Convex - Routes: GET /tasks, GET /tasks/:id, POST /tasks/:id/pickup, POST /tasks/:id/message - Agent identification via X-Agent-ID header Templates: - Updated HEARTBEAT.md to use daemon task API (curl commands) - Agents check localhost:4747/tasks on wake, pick up highest priority to_do * feat: Kanban board UI for task management - Full-width Kanban board with columns: To Do, In Progress, Review, Done, Blocked - Task cards show title, priority badge (color-coded), assignee name - Create task modal: title, description, 
priority, assignee, reviewers (multi-select) - Task detail modal: full info + message timeline - Convex: list/create/getMessages queries for frontend - Route: /tasks added to harbor routes (first in sidebar) * fix: reactive task detail modal + markdown rendering - Task detail modal now reads live data from Convex query instead of stale state snapshot — status updates appear in real time - Added react-markdown for rendering description and message content - Markdown styles for code blocks, lists, headings * ui: rename Blocked column to Waiting * feat: review flow — submit, approve, request changes Convex: - tasks.submit: in_progress → review, sets reviewer statuses to pending - tasks.review: approve or request changes per reviewer - All approved → done - Any changes_requested → back to in_progress, reset all statuses - HTTP routes for /api/daemon/tasks/submit and /api/daemon/tasks/review Daemon: - POST /tasks/:id/submit and /tasks/:id/review endpoints HEARTBEAT.md: - Updated template with assignee vs reviewer instructions - Agents now know how to submit work and review others' work * fix: daemon preserves per-agent heartbeat config during sync * fix: prevent assignee from being a reviewer - Convex: filter assigneeId out of reviewerIds on create - UI: hide assignee from reviewer checklist, clear if assignee changes * feat: block/unblock flow with required human message - Agents can block tasks via POST /tasks/:id/block with a reason - Task stores previousStatus for return after unblock - Human unblocks in UI with required message explaining what changed - Unblock returns task to previous status - Assignee cannot be a reviewer (enforced in Convex + UI) * feat: daemon-managed isolated cron heartbeats Replace built-in heartbeat with isolated cron sessions to prevent pattern lock. Each heartbeat runs in a fresh session — no history, no HEARTBEAT_OK muscle memory. 
- Add daemon/src/cron.ts: syncCronJobs creates/removes heartbeat cron jobs per agent via gateway RPC (idempotent) - Wire syncCronJobs into daemon tick loop - Disable built-in heartbeat in default config (every: 0m) - Configurable interval via HEARTBEAT_INTERVAL_MS env var * fix: heartbeat interval to 15m, add block/waiting instructions to HEARTBEAT.md - Default heartbeat interval changed from 60s to 15m (900000ms) - HEARTBEAT.md now instructs agents to use POST /tasks/:id/block when stuck or needing human input instead of leaving task in_progress - Also added submit-for-review instructions to HEARTBEAT.md * fix: use Docker service name for daemon URL in cron heartbeat The exec tool runs inside the gateway container (bridge network), not the sandbox (host network). localhost:4747 doesn't resolve to the daemon from the gateway container. Use 'daemon:4747' (Docker service name) instead. Also improved heartbeat message with explicit MUST-use-exec instruction and configurable DAEMON_INTERNAL_URL env var for non-Docker deployments. * fix(daemon): use localhost for daemon URL in heartbeat messages Sandbox containers use host networking, so the Docker Compose service name 'daemon' doesn't resolve. Use localhost:4747 instead since the daemon port is published to 127.0.0.1:4747. 
* fix: unblock message attribution with fromLabel field - Add optional fromLabel field to messages schema - Set fromLabel='Admin User' on unblock messages instead of '(human)' prefix - Display fromLabel when present in frontend message rendering - Add message timestamps and header layout in task detail view * feat: leader roles can create tasks via daemon API - Add POST /tasks/create daemon endpoint with leader role validation - Add POST /api/daemon/tasks/create Convex HTTP route - Add internalCreateTask mutation resolving sessionKeys to agent IDs - Pass agents list to handleTaskRequest for role lookup - Update heartbeat message with task creation instructions for leaders - Leader roles: project-manager, executive-assistant * feat: activity timestamps on tasks and messages - Add updatedAt field to tasks schema, set on every status change - Show relative time (e.g. '2h ago') on task cards and detail view - Show timestamps on each message in task detail - Show 'Updated Xm ago' on task cards when updatedAt is set * fix(frontend): style timestamp elements to match project conventions Add CSS for kanban-card-time, kanban-message-header, and kanban-message-time to use consistent 0.75rem/muted styling. * fix(frontend): remove capitalize on detail values, add 'Created' prefix on card timestamps * fix(tasks): skip review and go straight to done when no reviewers assigned * fix(frontend): tighten markdown paragraph spacing, capitalize status values * fix(frontend): hide empty paragraphs, cap modal height with scroll, tighten list spacing * fix(frontend): remove pre-wrap from description text, let ReactMarkdown handle spacing * fix(daemon): update existing cron jobs when heartbeat interval changes Include interval in sync fingerprint so changes trigger re-sync. Update existing jobs via cron.update instead of only creating new ones. * feat(tasks): sort by priority then oldest first Query returns tasks sorted urgent > high > medium > low, then by creation time. 
Heartbeat prompt instructs agents to pick first to_do. * fix(frontend): sticky sidebar and column headers, scrollable task area App layout uses fixed viewport height. Sidebar stays fixed, column headers stick to top while task cards scroll. * fix(frontend): split kanban into fixed headers + scrollable card area Headers row sits outside scroll container so they stay pinned. Board area scrolls independently. * fix(frontend): prevent app-main from scrolling, only kanban-board scrolls * fix(frontend): fix flex chain for kanban scroll - use flex: 1 1 0 and overflow hidden on wrapper * fix(frontend): use sticky headers within scrollable app-main, remove wrapper Simpler approach - kanban-headers are sticky within the naturally scrolling app-main container. * fix(frontend): add padding to sticky headers to prevent overlap with cards * fix(frontend): use correct background color (color-navy) for sticky headers * fix(frontend): extend sticky header coverage with box-shadow and more padding * fix(frontend): drop sticky headers, keep sidebar fixed + content scrolling * feat(tasks): only show done tasks from the last 24 hours on the board * fix(tasks): use updatedAt for done task 24h filter * feat: add delete task button and heartbeat error handling - Add error handling hint to heartbeat message (reply HEARTBEAT_OK if curl fails) - Add tasks.remove mutation that deletes task and all its messages - Add Delete Task button to TaskDetailModal with confirm dialog * fix(frontend): styled delete task with inline confirmation matching AgentsPage pattern * fix(daemon): add review instructions to heartbeat prompt Agents now know to review tasks where _agentRole is 'reviewer' and approve or request changes. * fix(daemon): revert to daemon:4747 — cron exec runs in gateway container, not sandbox Sandbox containers are not being created for cron sessions despite mode=all config. Exec falls back to running inside the gateway container which is on the compose bridge network. 
* fix(daemon): restore localhost:4747 — sandbox uses host networking for cron sessions * feat(frontend): sort Done column by newest first * feat(daemon): auto-detect and force-run stale cron jobs, update payload on sync * fix(daemon): use daemon:4747 — sandbox containers don't start for cron sessions * feat: disable sandboxing — use localhost:4747, guard sandbox config injection * feat: disable sandboxing — set mode off in default config * fix(daemon): use daemon:4747 — exec runs inside gateway container when sandbox is off * feat: build gateway from openclaw fork with cron sandbox fix - Add harborworks/openclaw as submodule (branch: fix/cron-sandbox-shallow-merge) - Build gateway image from source instead of npm - Re-enable sandbox (mode: all, scope: agent) - Switch DAEMON_URL back to localhost:4747 (sandbox uses host networking) Fixes the cron sandbox bug where Object.assign shallow merge clobbered sandbox.mode, causing cron sessions to run unsandboxed. See: openclaw#4171, openclaw#14556 * fix: pin openclaw submodule to v2026.2.13 + cron sandbox fix Build gateway from source (v2026.2.13 tag + sandbox shallow merge fix) instead of npm to avoid auth scope incompatibility with newer versions. Sandbox mode: all, DAEMON_URL: localhost:4747 (host networking) * fix: remove unused internalMutation import (fixes CI type check) * fix(ci): fetch submodules in deploy, fix gateway build context, use HTTPS for submodule URL
Gentle ping: any chance this could get a review? CI is green and mergeable. Happy to address any feedback.
Summary
- When a per-agent `sandbox` config contains only `docker` overrides (e.g. `workdir`, `network`, `binds`), the `Object.assign` shallow merge in `runCronIsolatedAgentTurn` replaces the entire `agents.defaults.sandbox`, losing `mode`, `scope`, and `workspaceAccess`. This causes `resolveSandboxConfigForAgent` to fall back to `mode: "off"`, resulting in cron isolated agents executing directly on the host instead of inside the sandbox container.
- Fix: exclude `sandbox` from the destructured agent config override. Sandbox resolution is already handled separately by `resolveSandboxConfigForAgent`, which properly merges global defaults with per-agent overrides via `agents.list[].sandbox`.

Reproduction
1. Set `agents.defaults.sandbox.mode: "all"` and `scope: "agent"`.
2. Add a `sandbox.docker` override (e.g. custom `workdir`, `network`, or `binds`) in `agents.list[]`.
3. Schedule a cron job with `sessionTarget: "isolated"` for that agent.
4. Have the job run `hostname && whoami`; it prints the host's hostname and user instead of the container's.

Root cause
The top-level `Object.assign` merge in `runCronIsolatedAgentTurn` replaces `agents.defaults.sandbox` wholesale instead of merging it field by field.
Related: #4171
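The reproduction setup can be pictured as a config object. This is a hedged sketch only: the trimmed-down `Config` type, the `id` field, the agent names, and the helper `agentsLosingSandboxMode` are hypothetical and exist purely to illustrate which agents would likely hit the symptom.

```typescript
// Hypothetical, trimmed-down config shape for illustration only.
type Sandbox = { mode?: string; scope?: string; docker?: Record<string, unknown> };
type Config = {
  agents: {
    defaults: { sandbox?: Sandbox };
    list: { id: string; sandbox?: Sandbox }[];
  };
};

const config: Config = {
  agents: {
    // Step 1: sandboxing enabled globally.
    defaults: { sandbox: { mode: "all", scope: "agent" } },
    list: [
      // Step 2: an override that only touches docker settings.
      { id: "reporter", sandbox: { docker: { workdir: "/workspace" } } },
      // An agent without any override is not affected.
      { id: "plain" },
    ],
  },
};

// Hypothetical helper: an agent is likely to lose its sandbox under a shallow
// merge when it defines a `sandbox` override that does not itself set `mode`.
function agentsLosingSandboxMode(cfg: Config): string[] {
  return cfg.agents.list
    .filter((agent) => agent.sandbox !== undefined && agent.sandbox.mode === undefined)
    .map((agent) => agent.id);
}

console.log(agentsLosingSandboxMode(config)); // only "reporter" is listed
```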
Greptile Overview
Greptile Summary
This change updates the cron isolated-agent runner to avoid shallow-merging a per-agent `sandbox` override into `agents.defaults`.

Previously, `runCronIsolatedAgentTurn` built `cfgWithAgentDefaults` by doing a top-level `Object.assign({}, agents.defaults, agentOverrideRest)`. If the per-agent override included only `sandbox.docker` keys, that shallow merge replaced the entire `sandbox` object and dropped `mode`/`scope`/`workspaceAccess`, causing downstream sandbox resolution to fall back to `mode: "off"` and run cron “isolated” turns on the host.

The fix destructures `sandbox` out of the per-agent override before the shallow merge, relying on the existing `resolveSandboxConfigForAgent` path (which merges `agents.defaults.sandbox` with `agents.list[].sandbox` field-by-field) to handle sandbox configuration correctly.

Confidence Score: 5/5
- The change is narrow (it excludes `sandbox` from a shallow merge) and aligns with the existing sandbox resolution flow (`resolveSandboxConfigForAgent` merges global defaults with per-agent overrides). No other files are touched in the head SHA, and the change prevents a real configuration regression without affecting unrelated agent defaults.
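The approach described in the review can be sketched as follows. This is not the actual openclaw code: `mergeAgentDefaults` and `resolveSandboxSketch` are hypothetical stand-ins for the merge step in `runCronIsolatedAgentTurn` and for `resolveSandboxConfigForAgent`, and the config types and the `"rw"` value are simplified assumptions.

```typescript
// Hypothetical, simplified config shapes; the real openclaw types have more fields.
type SandboxConfig = {
  mode?: "off" | "all";
  scope?: string;
  workspaceAccess?: string;
  docker?: { workdir?: string; network?: string; binds?: string[] };
};
type AgentConfig = { model?: string; sandbox?: SandboxConfig };

// Stand-in for the fixed merge step: pull `sandbox` out of the override so the
// shallow merge can no longer replace the default `sandbox` object.
function mergeAgentDefaults(defaults: AgentConfig, agentOverride: AgentConfig): AgentConfig {
  const { sandbox: _sandbox, ...agentOverrideRest } = agentOverride;
  return Object.assign({}, defaults, agentOverrideRest);
}

// Stand-in for field-by-field sandbox resolution: each field falls back to the
// default individually, and docker settings are merged key by key.
function resolveSandboxSketch(defaults?: SandboxConfig, override?: SandboxConfig): SandboxConfig {
  return {
    mode: override?.mode ?? defaults?.mode ?? "off",
    scope: override?.scope ?? defaults?.scope,
    workspaceAccess: override?.workspaceAccess ?? defaults?.workspaceAccess,
    docker: { ...defaults?.docker, ...override?.docker },
  };
}

const defaults: AgentConfig = {
  sandbox: { mode: "all", scope: "agent", workspaceAccess: "rw" }, // "rw" is a placeholder value
};
const agentOverride: AgentConfig = {
  model: "agent-model", // placeholder non-sandbox override
  sandbox: { docker: { network: "host" } },
};

const cfgWithDefaults = mergeAgentDefaults(defaults, agentOverride);
const resolved = resolveSandboxSketch(defaults.sandbox, agentOverride.sandbox);

console.log(cfgWithDefaults.sandbox?.mode); // "all": the default sandbox survives the merge
console.log(resolved.mode, resolved.docker?.network); // "all" and "host"
```

Keeping sandbox handling in a single field-aware resolver means that non-sandbox overrides still merge shallowly as before, while docker-only overrides no longer erase `mode`, `scope`, or `workspaceAccess`.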