fix(sandbox): use materialized skill paths in startup prompts#91791

Merged
vincentkoc merged 3 commits into
openclaw:mainfrom
brokemac79:fix/issue-91761-sandbox-skill-prompt
Jun 10, 2026

@brokemac79

@brokemac79 brokemac79 commented Jun 10, 2026

Contributor

Summary

Fixes #91761.

  • Updates sandbox startup/command prompt skill handling so writable Docker/SSH sandbox sessions use the materialized sandbox skill paths under /workspace/.openclaw/sandbox-skills/skills/....
  • Stops sandbox prompt construction from falling back to host/global npm skill snapshots when sandbox metadata is available; missing sandbox metadata fails closed to no skill prompt instead of advertising unreadable host paths.
  • Carries the materialized skills workspace and SSH remote workdir through sandbox context so Docker and SSH prompt paths are resolved from the sandbox runtime, not the host install location.
  • Keeps the change scoped to sandbox prompt/context path handling and regression coverage; no changelog edit per CONTRIBUTING.md contributor guidance.
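
A minimal sketch of the path handling summarized above, not the code in this PR. The names `SandboxSkillsContext` and `materializedSkillPath` are hypothetical; only the `/workspace/.openclaw/sandbox-skills/skills/<name>/SKILL.md` layout is taken from the PR text.

```typescript
// Hypothetical shape; the real sandbox context in the repo may differ.
interface SandboxSkillsContext {
  // Working directory inside the sandbox, e.g. "/workspace".
  containerWorkdir: string;
}

// Relative location of materialized skills, taken from the paths quoted in this PR.
const SANDBOX_SKILLS_DIR = ".openclaw/sandbox-skills/skills";

// Build the sandbox-readable SKILL.md path for a skill name.
// Returns undefined (fail closed) when no sandbox context is available,
// so callers never fall back to a host install path.
function materializedSkillPath(
  ctx: SandboxSkillsContext | undefined,
  skillName: string,
): string | undefined {
  if (!ctx) return undefined;
  const root = ctx.containerWorkdir.replace(/\/+$/, "");
  return `${root}/${SANDBOX_SKILLS_DIR}/${skillName}/SKILL.md`;
}
```

With `containerWorkdir` set to `/workspace` and the skill name `healthcheck`, this yields the same path that appears in the live proof output.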

AI-assisted: yes.

Linked context

Closes #91761.

Related: #90410 and #90798. PR #90798 fixed materialization/readability; this PR fixes the remaining startup-context prompt path that could still advertise host skill paths.

Requested by maintainer/user follow-up on #91761 after the reporter confirmed they were already on v2026.6.5 with an npm global install on WSL and a Docker sandbox.

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: Docker/rw sandbox startup context should point bundled skills at sandbox-readable materialized paths such as /workspace/.openclaw/sandbox-skills/skills/.../SKILL.md, not host/global npm paths such as ~/.npm-global/lib/node_modules/openclaw/skills/gog/SKILL.md.
  • Real environment tested: local Docker Desktop-backed sandbox from the PR checkout on Windows/WSL host, using the Docker sandbox backend with node:24-bookworm, workspaceAccess=rw, containerWorkdir=/workspace, and PR head 801d8125773da1cde355e820a7dc51c1e2a13309.
  • Exact steps or command run after this patch:
$env:OPENCLAW_STATE_DIR='C:\oc-work\oc-91761\.tmp\live-proof-state'
node --import tsx .tmp\live-sandbox-context-proof.mjs
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
LIVE_SANDBOX_CONTEXT_PROOF=1
SANDBOX_RUNTIME=mode=all sandboxed=true
SANDBOX_WORKSPACE_INFO containerWorkdir=/workspace workspaceAccess=rw
SANDBOX_SKILLS_HOST_EXISTS=true
SKILLS_PROMPT_WORKSPACE=/workspace/.openclaw/sandbox-skills
SKILLS_WORKSPACE_ONLY=true
SKILLS_SNAPSHOT_SUPPRESSED=true
SHOULD_LOAD_SKILL_ENTRIES=true
LOADED_SKILL_NAMES=healthcheck
SKILLS_PROMPT_HAS_MATERIALIZED_PATH=true
SKILLS_PROMPT_HOST_PATH_MATCHES={"~/.npm-global":false,"/root/.npm-global":false,"/home/node/.npm-global":false,"/usr/local/lib/node_modules/openclaw/skills/gog/SKILL.md":false,"/usr/local/lib/node_modules/openclaw/skills/healthcheck/SKILL.md":false,"C:\\":false,"<host-workspace-forward-slash>":false,"<host-workspace-windows>":false}
MATERIALIZED_PATH=/workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md
RENDERED_SKILLS_PROMPT_EXCERPT_START
<available_skills>
  <skill>
    <name>healthcheck</name>
    <description>Audit/harden OpenClaw hosts: SSH, firewall, updates, exposure, backups, disk encryption, gateway security.</description>
    <location>/workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md</location>
    <version>sha256:518ec6e0482cf1c7</version>
  </skill>
</available_skills>
RENDERED_SKILLS_PROMPT_EXCERPT_END
RESOLVING_REAL_DOCKER_SANDBOX_CONTEXT=1
DOCKER_SANDBOX_RUNTIME_ID=openclaw-sbx-agent-main-proof-non-main-7a4ae438
DOCKER_SANDBOX_WORKDIR=/workspace
DOCKER_SANDBOX_READ_EXIT=0
DOCKER_SANDBOX_READ_STDOUT=READ_OK /workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md
LIVE_SANDBOX_CONTEXT_PROOF_OK=1

The live proof uses healthcheck because gog is correctly filtered out in a clean node:24-bookworm sandbox when the gog binary is unavailable. The exercised production path is the same sandbox prompt path mapper, and the negative checks still include the reporter's gog host/global path marker.

  • Local Linux Docker container test command also run after this patch:
docker run --rm \
  -v C:\oc-work\oc-91761:/workspace \
  -v oc91761_node_modules:/workspace/node_modules \
  -v oc91761_pnpm_store:/root/.local/share/pnpm/store \
  -w /workspace node:24-bookworm bash -lc '
    set -euo pipefail
    corepack enable
    env CI=1 OPENCLAW_HEAVY_CHECK_LOCK_SCOPE=worktree \
      NODE_OPTIONS=--max-old-space-size=4096 \
      OPENCLAW_TEST_PROJECTS_PARALLEL=6 \
      OPENCLAW_VITEST_MAX_WORKERS=1 \
      OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 \
      node scripts/run-vitest.mjs \
        src/auto-reply/reply/commands-system-prompt.test.ts \
        src/agents/embedded-agent-runner/sandbox-skills.test.ts \
        src/agents/sandbox.resolveSandboxContext.test.ts \
        src/agents/sandbox/ssh-backend.test.ts
  '
  • Focused regression evidence after fix:
✓ src/agents/sandbox.resolveSandboxContext.test.ts (10 tests)
✓ src/agents/embedded-agent-runner/sandbox-skills.test.ts (5 tests)
✓ src/agents/sandbox/ssh-backend.test.ts (8 tests)
✓ src/auto-reply/reply/commands-system-prompt.test.ts (7 tests)
[test] passed 2 Vitest shards in 26.44s

The new command-prompt regression asserts the sandbox prompt contains /workspace/.openclaw/sandbox-skills/skills/gog/SKILL.md, does not contain ~/.npm-global, and does not call the host reusable skill snapshot resolver for the sandboxed path.

  • Observed result after fix: sandbox prompt/context generation uses materialized sandbox skill entries for sandboxed runs, and the SSH backend workspace context test confirms SSH prompt paths use the remote workspace root rather than the host workdir.
  • What was not tested: the reporter's exact WSL Ubuntu npm-global installation with their live /new thread was not run.
  • Proof limitations or environment constraints: local live Docker sandbox proof covers the rendered startup/context prompt path and sandbox readability on the PR head; Blacksmith Testbox provides remote Linux CI-style changed-gate proof, not the reporter's exact workstation state.
  • Before evidence: [Bug]: Docker sandbox still advertises host skill paths in startup context on v2026.6.5 #91761 reports that on v2026.6.5, after openclaw sandbox recreate --all, Gateway restart, and /new, the model still tried to read host paths like ~/.npm-global/lib/node_modules/openclaw/skills/gog/SKILL.md even though skills were materialized in the sandbox directory.
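
The negative checks in the proof output (`SKILLS_PROMPT_HOST_PATH_MATCHES`) can be approximated as below. This is a sketch under assumptions: `hostPathMatches` and `promptIsSandboxOnly` are hypothetical names, and the marker list is a subset copied from the output above rather than the list used by the proof script.

```typescript
// Markers for host/global install locations, copied from the proof output above.
const HOST_PATH_MARKERS: readonly string[] = [
  "~/.npm-global",
  "/root/.npm-global",
  "/home/node/.npm-global",
  "/usr/local/lib/node_modules/openclaw/skills/gog/SKILL.md",
  "C:\\",
];

// Report, per marker, whether it appears anywhere in the rendered prompt.
function hostPathMatches(prompt: string): Record<string, boolean> {
  const matches: Record<string, boolean> = {};
  for (const marker of HOST_PATH_MARKERS) {
    matches[marker] = prompt.includes(marker);
  }
  return matches;
}

// True only when no host marker leaked into the prompt.
function promptIsSandboxOnly(prompt: string): boolean {
  return HOST_PATH_MARKERS.every((marker) => !prompt.includes(marker));
}
```

A prompt that only contains `/workspace/.openclaw/sandbox-skills/...` locations passes `promptIsSandboxOnly`, while any `~/.npm-global` location makes it fail.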

Tests and validation

Focused commands run:

  • git diff --check -> passed.
  • Docker/Linux: corepack pnpm exec oxfmt --check --threads=1 <touched files> -> passed.
  • Docker/Linux: corepack pnpm exec oxlint --tsconfig config/tsconfig/oxlint.core.json <touched files> -> passed, 0 warnings/errors.
  • Docker/Linux: node scripts/run-vitest.mjs src/auto-reply/reply/commands-system-prompt.test.ts src/agents/embedded-agent-runner/sandbox-skills.test.ts src/agents/sandbox.resolveSandboxContext.test.ts src/agents/sandbox/ssh-backend.test.ts -> passed, 30 focused tests across 2 Vitest shards.
  • Live Docker sandbox prompt proof: node --import tsx .tmp\live-sandbox-context-proof.mjs with Docker backend node:24-bookworm -> passed; rendered <location>/workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md</location>, host/global path markers were false, and the Docker sandbox read returned READ_OK /workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md.
  • codex review --base origin/main -c service_tier='"fast"' -> passed; no high-confidence regressions found.
  • Blacksmith Testbox: tbx_01ktqfw0c7e184n5h5v69reb1b, warmup Actions run https://github.com/openclaw/openclaw/actions/runs/27245431845, reset checkout to origin/main, fetched brokemac79:fix/issue-91761-sandbox-skill-prompt, verified PR head 801d8125773da1cde355e820a7dc51c1e2a13309, then ran env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 CI=1 corepack pnpm check:changed -> passed with TESTBOX_EXIT=0.

Regression coverage added or updated:

  • src/auto-reply/reply/commands-system-prompt.test.ts: covers sandbox command prompts using /workspace/.openclaw/sandbox-skills/..., excluding host npm paths, and avoiding host skill snapshot fallback.
  • src/agents/embedded-agent-runner/sandbox-skills.test.ts: covers remapping sandbox skill prompts from materialized skill paths.
  • src/agents/sandbox.resolveSandboxContext.test.ts: covers materialized skills workspace metadata and SSH remote workspace path propagation.
  • src/agents/sandbox/ssh-backend.test.ts: existing coverage kept in the focused proof set for remote workspace behavior.
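
For the SSH case covered by these tests, the idea is that prompt paths are derived from the remote workdir and never from the host workdir. A rough sketch follows; `SshWorkspaceInfo`, `joinPosix`, and `sshPromptSkillPath` are hypothetical names, and it assumes the SSH backend uses the same `.openclaw/sandbox-skills/skills` layout as Docker.

```typescript
// Hypothetical description of an SSH-backed sandbox workspace.
interface SshWorkspaceInfo {
  hostWorkdir: string; // path on the machine running the gateway
  remoteWorkdir: string; // path on the SSH target where the workspace lives
}

// Join path segments with forward slashes regardless of the host OS,
// because the remote side is addressed with POSIX-style paths.
function joinPosix(...segments: string[]): string {
  const joined = segments
    .map((s, i) => (i === 0 ? s.replace(/\/+$/, "") : s.replace(/^\/+|\/+$/g, "")))
    .filter((s, i) => i === 0 || s.length > 0)
    .join("/");
  return joined.length > 0 ? joined : "/";
}

// Resolve the prompt-visible skill path from the remote workdir only.
// hostWorkdir is deliberately ignored so host paths cannot leak into the prompt.
function sshPromptSkillPath(info: SshWorkspaceInfo, skillName: string): string {
  return joinPosix(info.remoteWorkdir, ".openclaw/sandbox-skills/skills", skillName, "SKILL.md");
}
```

Keeping the join independent of the host separator matters here because the PR was developed on a Windows/WSL host while the sandbox side is Linux.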

Risk checklist

Did user-visible behavior change? Yes.

Did config, environment, or migration behavior change? No.

Did security, auth, secrets, network, or tool execution behavior change? Yes, narrowly: sandboxed startup context now avoids advertising host/global skill paths and fails closed if sandbox skill metadata is unavailable.

Highest-risk area: prompt construction for sandboxed sessions, because an incorrect fallback can either leak unreadable host paths back into the model context or omit usable sandbox skill paths.

Risk mitigation: sandbox prompt generation is now explicitly split between sandbox and non-sandbox paths, sandbox tests assert both inclusion of materialized sandbox paths and exclusion of host paths, and non-sandbox behavior continues to use the existing reusable workspace skill snapshot path.
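
The split described in the risk mitigation can be sketched as a single selection step. This is illustrative only: `PromptInputs` and `selectSkillEntries` are hypothetical, and `hostSnapshot` merely stands in for the existing reusable workspace skill snapshot.

```typescript
interface SkillEntry {
  name: string;
  location: string;
}

// Hypothetical inputs: `sandboxed` marks a sandboxed run, `sandboxSkills` is the
// materialized skill list (undefined when sandbox metadata could not be resolved),
// and `hostSnapshot` stands in for the existing host skill snapshot.
interface PromptInputs {
  sandboxed: boolean;
  sandboxSkills?: SkillEntry[];
  hostSnapshot: SkillEntry[];
}

function selectSkillEntries(inputs: PromptInputs): SkillEntry[] {
  if (inputs.sandboxed) {
    // Fail closed: with no sandbox metadata, advertise no skills at all
    // rather than unreadable host paths.
    return inputs.sandboxSkills ?? [];
  }
  // Non-sandbox runs keep using the host snapshot unchanged.
  return inputs.hostSnapshot;
}
```

The key property is that the sandboxed branch never reads `hostSnapshot`, which is what the new regression tests assert from the outside.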

Current review state

Next action: maintainer review and CI.

Still waiting on: maintainer/security review for the sandbox/security-sensitive behavior, including explicit sandbox/security owner acceptance of the fail-closed empty skills prompt behavior when sandbox metadata cannot be resolved.

Bot/reviewer comments addressed: ClawSweeper proof ask for redacted live Docker sandbox rendered prompt path proof.

@brokemac79 brokemac79 requested a review from a team as a code owner June 10, 2026 00:15
@openclaw-barnacle openclaw-barnacle Bot added docker Docker and sandbox tooling agents Agent runtime and tooling size: M proof: supplied External PR includes structured after-fix real behavior proof. labels Jun 10, 2026
@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 10, 2026, 12:05 AM ET / 04:05 UTC.

Summary
The PR rebuilds sandbox startup/command skill prompts from materialized sandbox skill paths and threads sandbox skill workspace metadata plus SSH remote workdirs through sandbox workspace context.

PR surface: Source +130, Tests +118. Total +248 across 7 files.

Reproducibility: yes, at source level. Current main still builds the command/context skills prompt from the host reusable snapshot while sandboxed, and the linked issue reports a released Docker reproduction, but I did not run the live current-main failure locally.

Review metrics: 2 noteworthy metrics.

  • Sandbox fallback behavior: 1 fail-closed path changed. Missing sandbox metadata now omits the skills prompt instead of falling back to host paths, so owner acceptance matters before merge.
  • Changed surface: 4 source files and 3 test files. The patch spans prompt construction, sandbox context metadata, SSH path exposure, and focused regression coverage.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
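
As a small illustration of the "weaker of the two" rule, assuming the rank order given in the rank legend of this review (`overallRank` is a hypothetical helper, not ClawSweeper code):

```typescript
// Ranks ordered from weakest to strongest, following the legend in this review.
// "off-meta tidepool" is left out because it means the rating does not apply.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is the weaker of the two component ranks.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}
```

For this PR, proof is diamond lobster and patch quality is platinum hermit, so the overall rank is platinum hermit.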

Rank-up moves:

  • [P1] Get sandbox/security owner acceptance for the fail-closed empty skills prompt behavior.
  • [P2] Decide whether OpenShell prompt-path proof is required before merge or tracked separately.

Risk before merge

  • [P1] Sandboxed command prompt generation now fails closed to an empty skills prompt when sandbox workspace metadata cannot be resolved; a sandbox/security owner should explicitly accept that degraded behavior before merge.
  • [P1] The provided live proof covers Docker and the tests cover SSH workdir calculation; OpenShell shares the sandbox backend invariant but remains outside this PR's live proof and may need separate follow-up if maintainers expect all backends covered now.

Maintainer options:

  1. Accept fail-closed sandbox prompts (recommended)
    A sandbox/security owner confirms that omitting skills is preferable to advertising unreadable host paths when sandbox metadata is unavailable.
  2. Add degraded-state visibility
    If maintainers want operator visibility, add a warning or context-report note for the empty sandbox skills prompt without restoring host-path fallback.
  3. Pause for all-backend proof
    Hold the PR until maintainers decide whether OpenShell needs the same prompt-path proof before this Docker/SSH fix lands.
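
Option 2 could look roughly like the sketch below: the prompt stays empty when metadata is missing, and a warning is surfaced separately. Everything here is hypothetical (`SkillsPromptResult`, `buildSandboxSkillsPrompt`, the warning text), and the rendered XML is simplified to the `<location>` element only.

```typescript
interface SkillsPromptResult {
  prompt: string;
  warnings: string[];
}

// `locations` is undefined when sandbox skill metadata could not be resolved.
function buildSandboxSkillsPrompt(locations: string[] | undefined): SkillsPromptResult {
  if (!locations) {
    // Fail closed, but tell the operator why the skills prompt is empty.
    return {
      prompt: "",
      warnings: ["sandbox skill metadata unavailable; skills prompt omitted"],
    };
  }
  const body = locations
    .map((location) => `  <skill><location>${location}</location></skill>`)
    .join("\n");
  return { prompt: `<available_skills>\n${body}\n</available_skills>`, warnings: [] };
}
```

This keeps the fail-closed behavior intact, since the host-path fallback is never reintroduced; only the visibility of the degraded state changes.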

Next step before merge

  • [P1] Human sandbox/security review should decide the fail-closed behavior and backend proof scope; there is no concrete automation repair to queue.

Security
Cleared: No dependency, workflow, secret, or supply-chain changes were introduced; the sandbox boundary behavior is tracked as merge risk rather than a concrete security regression.

Review details

Best possible solution:

Merge only after sandbox/security owner acceptance of the fail-closed prompt behavior, with OpenShell scope accepted as out-of-scope or tracked separately.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main still builds the command/context skills prompt from the host reusable snapshot while sandboxed, and the linked issue reports a released Docker reproduction, but I did not run the live current-main failure locally.

Is this the best way to solve the issue?

Yes for the Docker/SSH bug path: the PR reuses the existing embedded-run sandbox skill prompt mapping instead of inventing a second renderer, with the remaining decision limited to fail-closed behavior and backend scope.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against bf89552e6783.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comment provide after-fix terminal proof showing the rendered Docker sandbox prompt path, host-path exclusion, real Docker read success, focused tests, and passing Testbox changed gate.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body and follow-up comment provide after-fix terminal proof showing the rendered Docker sandbox prompt path, host-path exclusion, real Docker read success, focused tests, and passing Testbox changed gate.
  • remove rating: 🌊 off-meta tidepool: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

  • P1: The PR targets a released Docker workspaceAccess: "rw" sandbox workflow where startup context can still point skills at unreadable host paths.
  • merge-risk: 🚨 security-boundary: The diff changes sandbox skill-path visibility and fail-closed prompt behavior at the sandbox boundary.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body and follow-up comment provide after-fix terminal proof showing the rendered Docker sandbox prompt path, host-path exclusion, real Docker read success, focused tests, and passing Testbox changed gate.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comment provide after-fix terminal proof showing the rendered Docker sandbox prompt path, host-path exclusion, real Docker read success, focused tests, and passing Testbox changed gate.
Evidence reviewed

PR surface:

Source +130, Tests +118. Total +248 across 7 files.

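| Area | Files | Added | Removed | Net |
| --- | --- | --- | --- | --- |
| Source | 4 | 172 | 42 | +130 |
| Tests | 3 | 121 | 3 | +118 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 7 | 293 | 45 | +248 |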

Acceptance criteria:

  • [P1] Contributor-reported: node scripts/run-vitest.mjs src/auto-reply/reply/commands-system-prompt.test.ts src/agents/embedded-agent-runner/sandbox-skills.test.ts src/agents/sandbox.resolveSandboxContext.test.ts src/agents/sandbox/ssh-backend.test.ts passed.
  • [P1] Contributor-reported: live Docker sandbox prompt proof rendered /workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md and read it successfully.
  • [P1] Contributor-reported: Testbox tbx_01ktqfw0c7e184n5h5v69reb1b passed corepack pnpm check:changed.

What I checked:

Likely related people:

  • brokemac79: Authored the merged sandbox skills materialization fix in 3b6bcbfb and the current PR, so they have recent exact-area implementation context beyond merely opening this PR. (role: recent feature contributor; confidence: high; commits: 3b6bcbfb5045, 801d8125773d; files: src/agents/embedded-agent-runner/sandbox-skills.ts, src/agents/sandbox/context.ts, src/auto-reply/reply/commands-system-prompt.ts)
  • scotthuang: Current shallow blame attributes the existing command prompt snapshot path and sandbox workspace-info helper to 696c1ecd; the grafted history makes this a routing hint rather than firm ownership. (role: recent area contributor; confidence: medium; commits: 696c1ecd2068; files: src/auto-reply/reply/commands-system-prompt.ts, src/agents/sandbox/context.ts, src/agents/embedded-agent-runner/sandbox-skills.ts)
  • Shakker: Committed 696c1ecd, which current shallow history shows across the affected agent/sandbox prompt files. (role: recent integration committer; confidence: low; commits: 696c1ecd2068; files: src/auto-reply/reply/commands-system-prompt.ts, src/agents/sandbox/context.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. labels Jun 10, 2026
@brokemac79

Contributor Author

Added Blacksmith Testbox proof for the current PR head.

  • PR head tested: 801d8125773da1cde355e820a7dc51c1e2a13309
  • Testbox: tbx_01ktqfw0c7e184n5h5v69reb1b
  • Warmup Actions run: https://github.com/openclaw/openclaw/actions/runs/27245431845
  • Remote checkout: reset to origin/main, fetched brokemac79:fix/issue-91761-sandbox-skill-prompt, checked out the PR head, and verified the SHA.
  • Command: env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 CI=1 corepack pnpm check:changed
  • Result: passed with TESTBOX_EXIT=0.

The local Windows wrapper/sync path still hits the known Blacksmith rsync/control-socket issue, so this used the direct SSH path to the warmed Testbox described in the repo maintainer tooling notes. The box was stopped and the ephemeral SSH key was deleted after proof.

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed proof: supplied External PR includes structured after-fix real behavior proof. labels Jun 10, 2026
@brokemac79

Contributor Author

@clawsweeper re-review

I updated the PR body with redacted live Docker sandbox proof for the rendered startup/context skills prompt:

  • SANDBOX_RUNTIME=mode=all sandboxed=true
  • rendered <location>/workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md</location>
  • host/global path markers including ~/.npm-global and /usr/local/lib/node_modules/openclaw/skills/gog/SKILL.md are false
  • real Docker sandbox read returned READ_OK /workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md

The body also explicitly notes the remaining maintainer/security-owner acceptance gate for the fail-closed empty skills prompt behavior.

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels Jun 10, 2026
@brokemac79

Contributor Author

@clawsweeper re-review

I updated the PR body with redacted live Docker sandbox proof for the rendered startup/context skills prompt, and the repo Real behavior proof check now passes:

  • SANDBOX_RUNTIME=mode=all sandboxed=true
  • rendered <location>/workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md</location>
  • host/global path markers including ~/.npm-global and /usr/local/lib/node_modules/openclaw/skills/gog/SKILL.md are false
  • real Docker sandbox read returned READ_OK /workspace/.openclaw/sandbox-skills/skills/healthcheck/SKILL.md

The body also explicitly notes the remaining maintainer/security-owner acceptance gate for the fail-closed empty skills prompt behavior.

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 10, 2026
@brokemac79

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. labels Jun 10, 2026
@vincentkoc vincentkoc force-pushed the fix/issue-91761-sandbox-skill-prompt branch from 801d812 to 2553631 Compare June 10, 2026 14:35
@vincentkoc

Member

Maintainer fixes and pre-merge proof are complete.

What changed after autoreview:

  • added backend-owned workdir resolution for Docker, SSH, OpenShell, and future registered backends
  • preserved the shipped fallback for third-party backends that have not adopted the optional resolver
  • added focused registry, custom-backend, and prompt-path regression coverage
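
The backend-owned resolution with a preserved fallback might be shaped like this. It is a sketch, not the merged code: `SandboxBackend`, `registerBackend`, and `resolvePromptWorkdir` are hypothetical names, and using the container workdir as the fallback is an assumption standing in for whatever the shipped fallback actually is.

```typescript
// Parameters a backend may use to decide which workdir the prompt should show.
interface PromptWorkdirParams {
  containerWorkdir: string;
  remoteWorkdir?: string;
}

// Hypothetical backend contract. The resolver is optional so that third-party
// backends which have not adopted it keep their previous behavior.
interface SandboxBackend {
  id: string;
  resolvePromptWorkdir?: (params: PromptWorkdirParams) => string;
}

const backendRegistry = new Map<string, SandboxBackend>();

function registerBackend(backend: SandboxBackend): void {
  backendRegistry.set(backend.id, backend);
}

// Ask the backend for its prompt workdir; fall back to the container workdir
// (an assumed stand-in for the shipped fallback) when no resolver is registered.
function resolvePromptWorkdir(backendId: string, params: PromptWorkdirParams): string {
  const backend = backendRegistry.get(backendId);
  return backend?.resolvePromptWorkdir?.(params) ?? params.containerWorkdir;
}

registerBackend({ id: "docker", resolvePromptWorkdir: (p) => p.containerWorkdir });
registerBackend({ id: "ssh", resolvePromptWorkdir: (p) => p.remoteWorkdir ?? p.containerWorkdir });
registerBackend({ id: "custom-legacy" }); // no resolver: uses the fallback
```

Making the resolver optional is what lets new backends own their prompt paths without breaking backends that predate the hook.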

Verification on head 25536314fe40fb47bec78222831862ec4c79b693:

  • node scripts/run-vitest.mjs src/agents/sandbox/backend.test.ts src/agents/sandbox.resolveSandboxContext.test.ts src/agents/embedded-agent-runner/sandbox-skills.test.ts src/auto-reply/reply/commands-system-prompt.test.ts — 26 tests passed
  • .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main ... — clean; no accepted/actionable findings
  • remote check:changed via Crabbox Azure cbx_745eb13bffbc (violet-shrimp) — passed

Known proof gap: the original reporter's exact Docker installation was not re-run manually; source-level regression coverage and the remote changed gate cover the corrected prompt path.

@vincentkoc vincentkoc merged commit b71d8e1 into openclaw:main Jun 10, 2026
@openclaw-barnacle openclaw-barnacle Bot added extensions: openshell and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. labels Jun 10, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 11, 2026
…aw#91791)

* fix(sandbox): use materialized skill paths in command prompts

* fix(sandbox): resolve backend prompt workdirs

* fix(sandbox): preserve custom backend prompt fallback

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
@gbb-netizen

gbb-netizen commented Jun 12, 2026


@brokemac79 Just checking: was this meant to land in 2026.6.6? It was just released, but I'm not sure whether it included this fix. The screenshot below was taken after updating to 2026.6.6, recreating the Docker sandbox via openclaw sandbox recreate --all, restarting the gateway, and starting a /new session. For me it is still trying to access skills in inaccessible home directories after these steps.

image

brokemac79 commented Jun 12, 2026

Contributor Author

@gbb-netizen Thanks for the fresh 2026.6.6 repro. We are going to work this as a follow-up fix and will comment back here with the new PR reference when it is ready.

Current read of the evidence:

  • fix(sandbox): use materialized skill paths in startup prompts #91791 is included in v2026.6.6, but the screenshot still shows a fresh /new Docker sandbox session trying to read the host npm-global path /home/tzdai/.npm-global/lib/node_modules/openclaw/skills/gog/SKILL.md.
  • That is the same class of bug as [Bug]: Docker sandbox still advertises host skill paths in startup context on v2026.6.5 #91761: sandbox startup context should advertise materialized sandbox-readable skill paths under /workspace/.openclaw/sandbox-skills/skills/..., not host install paths.
  • This does not look like a normal doctor cleanup case. The existing session snapshot doctor repair targets stale cached runtime paths and would likely rewrite to the current bundled install path, which for this install is still the host npm-global path. It would not naturally rewrite a sandboxed run prompt to /workspace/.openclaw/sandbox-skills/....
  • The likely remaining gap is that fix(sandbox): use materialized skill paths in startup prompts #91791 fixed the command/system-prompt inspection path, but the normal agent startup path still builds/persists skillsSnapshot from resolveReusableWorkspaceSkillSnapshot in src/agents/agent-command.ts. That path better matches a real /new run.

Planned PR shape:

  • Add a regression test around the actual agent command/run path for Docker workspaceAccess: "rw", sandbox mode: "all", and a dashboard-style /new session key.
  • Assert the prompt/snapshot passed into the run contains /workspace/.openclaw/sandbox-skills/....
  • Assert it does not contain ~/.npm-global, /home/.../node_modules/openclaw/skills/..., or other host bundled skill paths.
  • Reuse the sandbox skill runtime mapping added in fix(sandbox): use materialized skill paths in startup prompts #91791 for the real startup snapshot path, keeping non-sandbox behavior unchanged.

We will still treat the reporter's environment as useful confirmation, but there is enough source-level and screenshot evidence to start a focused follow-up PR.

Contributor Author

Follow-up PR is now open: #92508.

Shape: the CLI run startup prompt path now rebuilds skill prompt entries from the sandbox materialized skills workspace when the session is sandboxed, instead of reusing the persisted host-path skill snapshot. That matches the evidence in the screenshot: #91791 landed in v2026.6.6, but it covered the prompt-inspection path, while CLI-backed startup prompts could still advertise /home/.../node_modules/openclaw/skills/... to the model.

Proof in the PR includes a regression test starting from a persisted /home/tzdai/.npm-global/lib/node_modules/openclaw/skills/gog/SKILL.md snapshot and asserting the CLI system prompt uses /workspace/.openclaw/sandbox-skills/skills/gog/SKILL.md instead.
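
The regression scenario above, starting from a persisted host-path snapshot, can be pictured with the simplified sketch below. It is not the code in #92508: the description says the fix rebuilds entries from the materialized skills workspace, whereas this hypothetical `remapSnapshotForSandbox` only re-derives locations by skill name to show the expected before and after.

```typescript
interface PersistedSkill {
  name: string;
  location: string;
}

// Re-derive each skill location under the sandbox skills workspace, keyed by
// skill name, so the host path stored in the snapshot is never reused.
function remapSnapshotForSandbox(
  snapshot: PersistedSkill[],
  sandboxSkillsRoot: string,
): PersistedSkill[] {
  const root = sandboxSkillsRoot.replace(/\/+$/, "");
  return snapshot.map((skill) => ({
    name: skill.name,
    location: `${root}/skills/${skill.name}/SKILL.md`,
  }));
}
```

Given a snapshot entry for `gog` at `/home/tzdai/.npm-global/lib/node_modules/openclaw/skills/gog/SKILL.md` and a root of `/workspace/.openclaw/sandbox-skills`, the remapped location is `/workspace/.openclaw/sandbox-skills/skills/gog/SKILL.md`.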

@brokemac79

Contributor Author

Hi @gbb-netizen - the new PR #92508 has now been merged into main.

Labels

agents Agent runtime and tooling docker Docker and sandbox tooling extensions: openshell merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. P1 High-priority user-facing bug, regression, or broken workflow. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: M status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Development

Successfully merging this pull request may close these issues.

[Bug]: Docker sandbox still advertises host skill paths in startup context on v2026.6.5

3 participants