fix(cron): exclude sandbox from shallow merge in isolated agent config #14556

Closed
seheepeak wants to merge 1 commit into openclaw:main from seheepeak:fix/cron-sandbox-shallow-merge

Conversation

@seheepeak
Contributor

@seheepeak seheepeak commented Feb 12, 2026

Summary

  • Bug: When a per-agent sandbox config contains only docker overrides (e.g. workdir, network, binds), the Object.assign shallow merge in runCronIsolatedAgentTurn replaces the entire agents.defaults.sandbox — losing mode, scope, and workspaceAccess. This causes resolveSandboxConfigForAgent to fall back to mode: "off", resulting in cron isolated agents executing directly on the host instead of inside the sandbox container.
  • Fix: Exclude sandbox from the destructured agent config override. Sandbox resolution is already handled separately by resolveSandboxConfigForAgent, which properly merges global defaults with per-agent overrides via agents.list[].sandbox.

Reproduction

  1. Set agents.defaults.sandbox.mode: "all" and scope: "agent"
  2. Add a per-agent sandbox.docker override (e.g. custom workdir, network, or binds) in agents.list[]
  3. Create a cron job with sessionTarget: "isolated" for that agent
  4. Run the cron job and execute hostname && whoami — it prints the host hostname and user instead of the container's

Root cause

src/cron/isolated-agent/run.ts:130-136

Object.assign({},
  params.cfg.agents.defaults,       // sandbox: { mode:"all", scope:"agent", ... }
  agentOverrideRest,                // sandbox: { docker: { ... } }  ← overwrites entirely
)
// Result: sandbox.mode is gone → defaults to "off"
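
The clobbering described above can be reproduced outside the codebase. The following is a minimal sketch, not the project's code: the `SandboxConfig` and `AgentConfig` shapes, the `model` field, and the `"rw"` value are simplified assumptions made for illustration. Only `Object.assign` and the field names `mode`, `scope`, `workspaceAccess`, and `docker` come from the description.

```typescript
// Hypothetical, simplified shapes; the real config types live in the repository.
interface SandboxConfig {
  mode?: "off" | "all";
  scope?: string;
  workspaceAccess?: string; // "rw" below is an assumed placeholder value
  docker?: { workdir?: string; network?: string; binds?: string[] };
}

interface AgentConfig {
  model?: string; // assumed extra field, used only to show unrelated keys survive
  sandbox?: SandboxConfig;
}

const defaults: AgentConfig = {
  model: "default-model",
  sandbox: { mode: "all", scope: "agent", workspaceAccess: "rw" },
};

// Per-agent override that carries only docker settings.
const agentOverride: AgentConfig = {
  sandbox: { docker: { network: "host" } },
};

// Shallow merge: the later source's `sandbox` replaces the whole object,
// so mode, scope, and workspaceAccess from the defaults are dropped.
const merged: AgentConfig = Object.assign({}, defaults, agentOverride);

console.log(merged.model);                    // "default-model"
console.log(merged.sandbox?.mode);            // undefined
console.log(merged.sandbox?.docker?.network); // "host"
```

Any consumer that treats a missing `mode` as "off" would then skip the sandbox for this agent, which matches the behavior reported here.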

Related: #4171

Greptile Overview

Greptile Summary

This change updates the cron isolated-agent runner to avoid shallow-merging a per-agent sandbox override into agents.defaults.

Previously, runCronIsolatedAgentTurn built cfgWithAgentDefaults by doing a top-level Object.assign({}, agents.defaults, agentOverrideRest). If the per-agent override included only sandbox.docker keys, that shallow merge replaced the entire sandbox object and dropped mode/scope/workspaceAccess, causing downstream sandbox resolution to fall back to mode: "off" and run cron “isolated” turns on the host.

The fix destructures sandbox out of the per-agent override before the shallow merge, relying on the existing resolveSandboxConfigForAgent path (which merges agents.defaults.sandbox with agents.list[].sandbox field-by-field) to handle sandbox configuration correctly.
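
As a rough illustration of that approach, the sketch below pulls `sandbox` out before the shallow merge and resolves it separately. It rests on assumptions: `mergeSandbox` is a hypothetical stand-in for the field-by-field merge attributed to `resolveSandboxConfigForAgent`, whose actual implementation is not shown in this thread, and the config shapes and values are simplified placeholders.

```typescript
// Hypothetical, simplified shapes; not the project's real types.
interface SandboxConfig {
  mode?: "off" | "all";
  scope?: string;
  workspaceAccess?: string;
  docker?: { workdir?: string; network?: string; binds?: string[] };
}

interface AgentConfig {
  model?: string;
  sandbox?: SandboxConfig;
}

// Hypothetical stand-in for a field-by-field sandbox merge:
// top-level fields and docker fields are each merged key by key.
function mergeSandbox(base: SandboxConfig = {}, override: SandboxConfig = {}): SandboxConfig {
  return {
    ...base,
    ...override,
    docker: { ...base.docker, ...override.docker },
  };
}

const defaults: AgentConfig = {
  model: "default-model",
  sandbox: { mode: "all", scope: "agent", workspaceAccess: "rw" },
};

const agentOverride: AgentConfig = {
  model: "agent-model",
  sandbox: { docker: { network: "host" } },
};

// Pull `sandbox` out so it never takes part in the shallow merge.
const { sandbox: agentSandbox, ...agentOverrideRest } = agentOverride;

const mergedDefaults: AgentConfig = Object.assign({}, defaults, agentOverrideRest);
const resolvedSandbox = mergeSandbox(defaults.sandbox, agentSandbox);

console.log(mergedDefaults.model);           // "agent-model"
console.log(mergedDefaults.sandbox?.mode);   // "all"
console.log(resolvedSandbox.mode);           // "all"
console.log(resolvedSandbox.docker?.network); // "host"
```

Keeping sandbox resolution in one place means the shallow merge only has to be correct for flat, non-nested overrides.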

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk.
  • The commit is narrowly scoped (excludes sandbox from a shallow merge) and aligns with the existing sandbox resolution flow (resolveSandboxConfigForAgent merges global defaults with per-agent overrides). No other files are touched in the head SHA, and the change prevents a real configuration regression without affecting unrelated agent defaults.
  • src/cron/isolated-agent/run.ts

@openclaw-barnacle (bot) added the scripts, docker, and size: S labels on Feb 12, 2026
@seheepeak force-pushed the fix/cron-sandbox-shallow-merge branch from 244e19e to f54da17 on February 12, 2026 09:08
@openclaw-barnacle (bot) added the size: XS label and removed the scripts, docker, and size: S labels on Feb 12, 2026
When a per-agent sandbox config (e.g. docker overrides) exists in
agents.list[].sandbox, the Object.assign shallow merge in
runCronIsolatedAgentTurn overwrites the entire agents.defaults.sandbox
object — losing mode, scope, and workspaceAccess fields. This causes
resolveSandboxConfigForAgent to fall back to mode:"off", resulting in
cron isolated agents executing on the host instead of inside the
sandbox container.

Fix: exclude sandbox from the destructured agent config override.
Sandbox resolution is already handled separately by
resolveSandboxConfigForAgent, which properly merges global defaults
with per-agent overrides.

Closes openclaw#4171
@seheepeak force-pushed the fix/cron-sandbox-shallow-merge branch from f54da17 to 99517e4 on February 12, 2026 09:11
frodo-harborbot added a commit to harborworks/openclaw that referenced this pull request Feb 15, 2026
When a per-agent sandbox config contains only docker overrides (e.g. workdir,
network, binds), the Object.assign shallow merge replaces the entire
agents.defaults.sandbox — losing mode, scope, and workspaceAccess. This causes
resolveSandboxConfigForAgent to fall back to mode: 'off', resulting in cron
isolated agents executing directly on the host instead of inside the sandbox.

Exclude sandbox from the destructured agent config override since sandbox
resolution is already handled separately by resolveSandboxConfigForAgent,
which properly merges global defaults with per-agent overrides.

Fixes openclaw#4171
Related: openclaw#14556
@jbencook
We're running into this exact issue on our multi-agent setup (gateway in Docker, sandbox.mode: "all", scope: "agent", per-agent docker.binds overrides). Cron isolated sessions were silently falling back to exec inside the gateway container because the shallow merge wiped sandbox.mode.

Spent a few hours debugging this today — the symptoms are confusing because sandbox containers exist but stay stopped, and exec just falls back to in-gateway with no error. The only signal is that Docker service names resolve but localhost doesn't (or vice versa depending on your network setup).

This fix is correct. resolveSandboxConfigForAgent already handles the merge properly downstream — the Object.assign in runCronIsolatedAgentTurn just needs to stop clobbering it.

Would love to see this merged — happy to help test if needed.

frodo-harborbot added a commit to harborworks/openclaw that referenced this pull request Feb 16, 2026
…openclaw#69)

* fix: sandbox mount paths + DNS for gateway-in-Docker

- Add per-agent sandbox binds using HOST_WORKSPACE_DIR (host paths)
  so sandbox containers can read workspace files
- Add explicit DNS (8.8.8.8) to daemon and gateway containers
  to avoid host systemd-resolved failures

* feat: enable heartbeat (15m), add message tool to sandbox, fix telegram bot token fallback

- Set default heartbeat interval to 15m (was 0m/disabled)
- Add 'message' to sandbox tool allow list so agents can send Telegram messages
- Set top-level botToken fallback from first agent's token to prevent
  'Telegram bot token missing' error when agents use message tool

* revert: remove message tool and botToken fallback

Keep scope to heartbeat enablement only. Task comms will go through
the daemon task API instead of direct agent messaging.

* feat: task API v1 — daemon endpoints + Convex backend

Minimum viable task API for agent heartbeat integration:

Convex:
- Updated task schema: to_do/in_progress/review/done/blocked statuses,
  single assigneeId (required), reviewerIds array, per-reviewer status
- Added tasks.ts: listByAgent, getWithMessages, pickup, addMessage
- Added HTTP routes: /api/daemon/tasks, tasks/get, tasks/pickup, tasks/message
- Added template update endpoint: /api/daemon/templates/update
- Added seed:createTestTask for testing
- Schema compat: added browserEnabled to harbors, reviewers to tasks

Daemon:
- New tasks.ts module: HTTP proxy from agent sandbox to Convex
- Routes: GET /tasks, GET /tasks/:id, POST /tasks/:id/pickup, POST /tasks/:id/message
- Agent identification via X-Agent-ID header

Templates:
- Updated HEARTBEAT.md to use daemon task API (curl commands)
- Agents check localhost:4747/tasks on wake, pick up highest priority to_do

* feat: Kanban board UI for task management

- Full-width Kanban board with columns: To Do, In Progress, Review, Done, Blocked
- Task cards show title, priority badge (color-coded), assignee name
- Create task modal: title, description, priority, assignee, reviewers (multi-select)
- Task detail modal: full info + message timeline
- Convex: list/create/getMessages queries for frontend
- Route: /tasks added to harbor routes (first in sidebar)

* fix: reactive task detail modal + markdown rendering

- Task detail modal now reads live data from Convex query instead of
  stale state snapshot — status updates appear in real time
- Added react-markdown for rendering description and message content
- Markdown styles for code blocks, lists, headings

* ui: rename Blocked column to Waiting

* feat: review flow — submit, approve, request changes

Convex:
- tasks.submit: in_progress → review, sets reviewer statuses to pending
- tasks.review: approve or request changes per reviewer
  - All approved → done
  - Any changes_requested → back to in_progress, reset all statuses
- HTTP routes for /api/daemon/tasks/submit and /api/daemon/tasks/review

Daemon:
- POST /tasks/:id/submit and /tasks/:id/review endpoints

HEARTBEAT.md:
- Updated template with assignee vs reviewer instructions
- Agents now know how to submit work and review others' work

* fix: daemon preserves per-agent heartbeat config during sync

* fix: prevent assignee from being a reviewer

- Convex: filter assigneeId out of reviewerIds on create
- UI: hide assignee from reviewer checklist, clear if assignee changes

* feat: block/unblock flow with required human message

- Agents can block tasks via POST /tasks/:id/block with a reason
- Task stores previousStatus for return after unblock
- Human unblocks in UI with required message explaining what changed
- Unblock returns task to previous status
- Assignee cannot be a reviewer (enforced in Convex + UI)

* feat: daemon-managed isolated cron heartbeats

Replace built-in heartbeat with isolated cron sessions to prevent
pattern lock. Each heartbeat runs in a fresh session — no history,
no HEARTBEAT_OK muscle memory.

- Add daemon/src/cron.ts: syncCronJobs creates/removes heartbeat
  cron jobs per agent via gateway RPC (idempotent)
- Wire syncCronJobs into daemon tick loop
- Disable built-in heartbeat in default config (every: 0m)
- Configurable interval via HEARTBEAT_INTERVAL_MS env var

* fix: heartbeat interval to 15m, add block/waiting instructions to HEARTBEAT.md

- Default heartbeat interval changed from 60s to 15m (900000ms)
- HEARTBEAT.md now instructs agents to use POST /tasks/:id/block when
  stuck or needing human input instead of leaving task in_progress
- Also added submit-for-review instructions to HEARTBEAT.md

* fix: use Docker service name for daemon URL in cron heartbeat

The exec tool runs inside the gateway container (bridge network), not
the sandbox (host network). localhost:4747 doesn't resolve to the daemon
from the gateway container. Use 'daemon:4747' (Docker service name) instead.

Also improved heartbeat message with explicit MUST-use-exec instruction
and configurable DAEMON_INTERNAL_URL env var for non-Docker deployments.

* fix(daemon): use localhost for daemon URL in heartbeat messages

Sandbox containers use host networking, so the Docker Compose service
name 'daemon' doesn't resolve. Use localhost:4747 instead since the
daemon port is published to 127.0.0.1:4747.

* fix: unblock message attribution with fromLabel field

- Add optional fromLabel field to messages schema
- Set fromLabel='Admin User' on unblock messages instead of '(human)' prefix
- Display fromLabel when present in frontend message rendering
- Add message timestamps and header layout in task detail view

* feat: leader roles can create tasks via daemon API

- Add POST /tasks/create daemon endpoint with leader role validation
- Add POST /api/daemon/tasks/create Convex HTTP route
- Add internalCreateTask mutation resolving sessionKeys to agent IDs
- Pass agents list to handleTaskRequest for role lookup
- Update heartbeat message with task creation instructions for leaders
- Leader roles: project-manager, executive-assistant

* feat: activity timestamps on tasks and messages

- Add updatedAt field to tasks schema, set on every status change
- Show relative time (e.g. '2h ago') on task cards and detail view
- Show timestamps on each message in task detail
- Show 'Updated Xm ago' on task cards when updatedAt is set

* fix(frontend): style timestamp elements to match project conventions

Add CSS for kanban-card-time, kanban-message-header, and
kanban-message-time to use consistent 0.75rem/muted styling.

* fix(frontend): remove capitalize on detail values, add 'Created' prefix on card timestamps

* fix(tasks): skip review and go straight to done when no reviewers assigned

* fix(frontend): tighten markdown paragraph spacing, capitalize status values

* fix(frontend): hide empty paragraphs, cap modal height with scroll, tighten list spacing

* fix(frontend): remove pre-wrap from description text, let ReactMarkdown handle spacing

* fix(daemon): update existing cron jobs when heartbeat interval changes

Include interval in sync fingerprint so changes trigger re-sync.
Update existing jobs via cron.update instead of only creating new ones.

* feat(tasks): sort by priority then oldest first

Query returns tasks sorted urgent > high > medium > low, then by
creation time. Heartbeat prompt instructs agents to pick first to_do.

* fix(frontend): sticky sidebar and column headers, scrollable task area

App layout uses fixed viewport height. Sidebar stays fixed,
column headers stick to top while task cards scroll.

* fix(frontend): split kanban into fixed headers + scrollable card area

Headers row sits outside scroll container so they stay pinned.
Board area scrolls independently.

* fix(frontend): prevent app-main from scrolling, only kanban-board scrolls

* fix(frontend): fix flex chain for kanban scroll - use flex: 1 1 0 and overflow hidden on wrapper

* fix(frontend): use sticky headers within scrollable app-main, remove wrapper

Simpler approach - kanban-headers are sticky within the naturally
scrolling app-main container.

* fix(frontend): add padding to sticky headers to prevent overlap with cards

* fix(frontend): use correct background color (color-navy) for sticky headers

* fix(frontend): extend sticky header coverage with box-shadow and more padding

* fix(frontend): drop sticky headers, keep sidebar fixed + content scrolling

* feat(tasks): only show done tasks from the last 24 hours on the board

* fix(tasks): use updatedAt for done task 24h filter

* feat: add delete task button and heartbeat error handling

- Add error handling hint to heartbeat message (reply HEARTBEAT_OK if curl fails)
- Add tasks.remove mutation that deletes task and all its messages
- Add Delete Task button to TaskDetailModal with confirm dialog

* fix(frontend): styled delete task with inline confirmation matching AgentsPage pattern

* fix(daemon): add review instructions to heartbeat prompt

Agents now know to review tasks where _agentRole is 'reviewer'
and approve or request changes.

* fix(daemon): revert to daemon:4747 — cron exec runs in gateway container, not sandbox

Sandbox containers are not being created for cron sessions despite
mode=all config. Exec falls back to running inside the gateway
container which is on the compose bridge network.

* fix(daemon): restore localhost:4747 — sandbox uses host networking for cron sessions

* feat(frontend): sort Done column by newest first

* feat(daemon): auto-detect and force-run stale cron jobs, update payload on sync

* fix(daemon): use daemon:4747 — sandbox containers don't start for cron sessions

* feat: disable sandboxing — use localhost:4747, guard sandbox config injection

* feat: disable sandboxing — set mode off in default config

* fix(daemon): use daemon:4747 — exec runs inside gateway container when sandbox is off

* feat: build gateway from openclaw fork with cron sandbox fix

- Add harborworks/openclaw as submodule (branch: fix/cron-sandbox-shallow-merge)
- Build gateway image from source instead of npm
- Re-enable sandbox (mode: all, scope: agent)
- Switch DAEMON_URL back to localhost:4747 (sandbox uses host networking)

Fixes the cron sandbox bug where Object.assign shallow merge clobbered
sandbox.mode, causing cron sessions to run unsandboxed.
See: openclaw#4171, openclaw#14556

* fix: pin openclaw submodule to v2026.2.13 + cron sandbox fix

Build gateway from source (v2026.2.13 tag + sandbox shallow merge fix)
instead of npm to avoid auth scope incompatibility with newer versions.

Sandbox mode: all, DAEMON_URL: localhost:4747 (host networking)

* fix: remove unused internalMutation import (fixes CI type check)

* fix(ci): fetch submodules in deploy, fix gateway build context, use HTTPS for submodule URL
@seheepeak
Contributor Author

Gentle ping — any chance this could get a review? CI is green and mergeable. Happy to address any feedback.

@vincentkoc
Member

Closing as superseded by #4226.

Issue #4171 is already closed against #4226, so this PR is no longer the active resolution for that issue.

@vincentkoc
Member

Superseded by #4226; issue #4171 is already closed against that PR.

@vincentkoc vincentkoc closed this Mar 7, 2026