Skip to content

fix(auto-reply): stage sandboxed workspace media#86531

Merged
steipete merged 3 commits into
openclaw:mainfrom
mjamiv:codex/74061-workspace-media-staging
May 26, 2026
Merged

fix(auto-reply): stage sandboxed workspace media#86531
steipete merged 3 commits into
openclaw:mainfrom
mjamiv:codex/74061-workspace-media-staging

Conversation

@mjamiv

@mjamiv mjamiv commented May 25, 2026

Copy link
Copy Markdown
Contributor

Fixes #74061.

Summary

  • Stages absolute media paths that are already inside the agent workspace before trying sandbox path translation.
  • Keeps absolute host-local paths outside the workspace blocked when sandbox mapping fails.
  • Keeps host-read HTML out of this PR: .html remains denied under hostReadCapability pending a separate security-boundary review, and extension-declared .html files fail closed instead of falling through to binary/media sniffing.
  • Adds an explicit WhatsApp QA scenario return type so the existing RTT proof object keeps its literal request-to-observed-message source type and production typecheck stays green on current main.

Real behavior proof

  • Behavior or issue addressed: Telegram/local final-reply MEDIA: paths that point to files inside the agent workspace can be dropped when a sandbox workspace exists because sandbox path translation runs before the normal local staging path. The reported repro includes /home/openclaw/.openclaw/workspace/.../screenshot.png and other local workspace files receiving only Media failed.

  • Real environment tested: Current OpenClaw PR worktree for codex/74061-workspace-media-staging on Ubuntu 24.04 with repository dependencies already installed.

  • Exact steps or command run after this patch:

pnpm exec vitest run --config test/vitest/vitest.media.config.ts src/auto-reply/reply/reply-media-paths.test.ts src/media/web-media.test.ts
node scripts/test-projects.mjs src/auto-reply/reply/reply-media-paths.test.ts src/media/web-media.test.ts
pnpm tsgo:prod
pnpm check:test-types
git diff --check
pnpm run test:extensions:package-boundary:compile
pnpm run test:extensions:package-boundary:canary
  • Evidence after fix:
RUN  v4.1.7 /tmp/openclaw-74061.LyunfH
Test Files  1 passed (1)
Tests  66 passed (66)
[test] starting test/vitest/vitest.media.config.ts
[test] starting test/vitest/vitest.auto-reply.config.ts
[test] passed 2 Vitest shards in 10.22s
extension package boundary check passed
mode: compile
compiled plugins: 108
canary plugins: 0
extension package boundary check passed
mode: canary
compiled plugins: 0
canary plugins: 2
  • Observed result after fix: The reply media normalizer now stages /Users/peter/.openclaw/workspace/reports/screenshot.png to /tmp/outbound-media/screenshot.png even when ensureSandboxWorkspaceForSession() returns a sandbox workspace, instead of dropping the path before delivery. The outside-workspace host path case remains blocked.

  • Security-boundary result after narrowing: text/html was removed from the host-read document allowlist. Valid local .html files now reject with path-not-allowed, and binary-disguised .html files are explicitly blocked before media sniffing can treat them as another allowed media type. HTML host-read support should be reviewed separately if maintainers want it.

  • What was not tested: I did not send a live Telegram Bot API message from the reporter's environment because those channel credentials are not available in this checkout. The proof exercises the production normalizer and host-read boundary code paths in the repository test harness.

@openclaw-barnacle openclaw-barnacle Bot added size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 25, 2026, 6:51 PM ET / 22:51 UTC.

Summary
The branch stages absolute reply-media paths that resolve inside the agent workspace before sandbox media translation, adds regression coverage, keeps host-read HTML denied/fail-closed, and preserves a WhatsApp QA return type.

PR surface: Source +24, Tests +42. Total +66 across 5 files.

Reproducibility: yes. by source inspection: current main attempts sandbox media translation before absolute workspace staging when a sandbox root exists, which can drop a valid host workspace path. The PR adds focused regression coverage for that exact path.

Review metrics: 1 noteworthy metric.

  • Host-read MIME allowance: 0 MIME types added; .html explicitly denied. This confirms the narrowed PR no longer expands the host-read HTML allowlist while still touching a media security boundary.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Mantis proof suggestion
A Telegram Desktop proof would add useful maintainer-visible evidence for the user-facing media delivery path that local normalizer proof cannot fully show. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

telegram desktop proof: verify that a final MEDIA path inside the workspace sends the file in Telegram instead of Media failed.

Risk before merge

  • Merging changes the sandbox/local-file ordering so absolute paths inside the agent workspace are staged before sandbox translation; maintainers should explicitly accept that workspace-only boundary behavior.
  • The PR body provides production-normalizer and host-read proof, but it does not include a live Telegram delivery showing the reporter-visible path changes from Media failed to a delivered file.

Maintainer options:

  1. Accept scoped workspace staging
    Merge if maintainers are comfortable that only absolute paths resolving inside the agent workspace are staged early, while outside host paths, file URLs, and HTML host reads remain blocked.
  2. Request live Telegram proof first
    Ask for Mantis or equivalent live Telegram proof that a generated workspace media file is delivered instead of producing Media failed before merging.

Next step before merge
No automated repair is needed; maintainers should decide whether the scoped sandbox/local-file boundary and available proof are sufficient before merge.

Security
Cleared: No concrete security or supply-chain defect was found; the patch narrows HTML host-read handling and keeps the remaining workspace-staging change visible as merge risk.

Review details

Best possible solution:

Land the scoped workspace-media staging fix after maintainer acceptance of the security boundary and any desired live Telegram proof, while keeping HTML host-read support for a separate review.

Do we have a high-confidence way to reproduce the issue?

Yes, by source inspection: current main attempts sandbox media translation before absolute workspace staging when a sandbox root exists, which can drop a valid host workspace path. The PR adds focused regression coverage for that exact path.

Is this the best way to solve the issue?

Yes, the patch is a narrow fix because it stages only paths that resolve under the agent workspace, preserves outside-host-path blocking, and keeps HTML host-read denied. The remaining question is maintainer acceptance of the boundary and live Telegram proof, not a code repair.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 00f98095316a.

Label changes

Label changes:

  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

  • P1: The linked regression breaks a real Telegram final-reply media workflow by dropping local workspace attachments.
  • merge-risk: 🚨 security-boundary: The patch intentionally changes when workspace-local absolute paths can be read and staged before sandbox path translation.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body and follow-up comments provide after-fix terminal proof for the production normalizer and host-read boundary, though not a live Telegram delivery from the reporter environment.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body and follow-up comments provide after-fix terminal proof for the production normalizer and host-read boundary, though not a live Telegram delivery from the reporter environment.
  • mantis: telegram-visible-proof: Mantis should capture Telegram visible proof. The PR changes visible Telegram final-reply media behavior, so a short Telegram Desktop proof could directly show the file delivered instead of Media failed.
Evidence reviewed

PR surface:

Source +24, Tests +42. Total +66 across 5 files.

View PR surface stats
Area Files Added Removed Net
Source 3 30 6 +24
Tests 2 42 0 +42
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 5 72 6 +66

What I checked:

  • Current-main behavior: Current main resolves a sandbox root and attempts sandbox media translation before falling back to non-sandbox absolute local staging, so an absolute host workspace path can be dropped when a sandbox workspace exists. (src/auto-reply/reply/reply-media-paths.ts:162, 00f98095316a)
  • PR fix path: The PR adds resolveAbsoluteWorkspaceMedia before sandbox translation and only stages the path when it resolves under params.workspaceDir; outside workspace paths continue to fall through to the existing sandbox/blocking path. (src/auto-reply/reply/reply-media-paths.ts:147, 6b1fe51ee13b)
  • Regression coverage: The PR adds a focused test proving an absolute workspace media path is staged before sandbox mapping when ensureSandboxWorkspaceForSession returns a sandbox workspace. (src/auto-reply/reply/reply-media-paths.test.ts:255, 6b1fe51ee13b)
  • HTML boundary narrowed: The PR keeps text/html out of HOST_READ_ALLOWED_DOCUMENT_MIMES and adds a declared-text fail-closed guard for .html instead of adding host-read HTML support. (src/media/web-media.ts:138, 6b1fe51ee13b)
  • Merge-result scope: The available GitHub merge object changes the auto-reply and media files only; the earlier WhatsApp QA type-only drift is not part of the merge result against the tested base. (d5e97245d9b8)
  • Maintainer proof context: The Telegram maintainer note prefers real Telegram proof for PRs touching visible Telegram transport/reply behavior, which supports a maintainer decision on whether to require Mantis/live proof before merge. (.agents/maintainer-notes/telegram.md:35, 00f98095316a)

Likely related people:

  • steipete: git blame on the current auto-reply media and web-media boundary lines points to the recent shared-coercion refactor, and git log also shows recent media-test stabilization in this area. (role: recent area contributor; confidence: high; commits: 77d9ac30bb8d, a374c3a5bfd5; files: src/auto-reply/reply/reply-media-paths.ts, src/media/web-media.ts)
  • Ayaan Zaidi: git log -S found the earlier reply media path normalization commit that introduced central behavior in the normalizer path this PR changes. (role: reply-media path contributor; confidence: medium; commits: 77ef67246830; files: src/auto-reply/reply/reply-media-paths.ts)
  • Gustavo Madeira Santana: git log -S found the sandboxed media hardening commit related to the sandbox translation boundary involved in this regression. (role: sandbox media hardening contributor; confidence: medium; commits: 4434cae565a3; files: src/agents/sandbox-paths.ts, src/auto-reply/reply/reply-media-paths.ts)
  • Chen Chia Yang: git log -S found the host-local CSV/Markdown upload work that owns the HOST_READ_TEXT_PLAIN_ALIASES boundary this PR deliberately keeps narrow. (role: host-read media boundary contributor; confidence: medium; commits: d2a219ea44d9; files: src/media/web-media.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 25, 2026
@mjamiv mjamiv force-pushed the codex/74061-workspace-media-staging branch from 2b17025 to b038b7a Compare May 25, 2026 15:13
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. mantis: telegram-visible-proof Mantis should capture Telegram visible proof. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Gilded Merge Sprite

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: sleeps inside passing CI.
Image traits: location green-check meadow; accessory little merge flag; palette cobalt, lime, and pearl; mood focused; pose peeking out from the egg shell; shell frosted glass shell; lighting soft studio lighting; background soft code-shaped tiles.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Gilded Merge Sprite in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@mjamiv

mjamiv commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

Sanitized proof/maintainer follow-up for ClawSweeper.

No Telegram bot token, chat ID, user credential, private file content, or proprietary channel data is included here. I also did not upload any local file to Telegram from my workstation.

Local staging proof

Synthetic fixture only:

  • PR/head under discussion: f6887494838cb8df6891978d0244342ad9c494e6
  • Runtime state isolated under /tmp:
    • OPENCLAW_HOME=/tmp/openclaw-86531-proof-home
    • OPENCLAW_STATE_DIR=/tmp/openclaw-86531-proof-state
  • Generated source file: /tmp/openclaw-86531-proof-apvdgX/workspace/reports/report.html
  • workspaceDir: /tmp/openclaw-86531-proof-apvdgX/workspace
  • Marker: pr-86531-proof-1779725125822
  • Source file bytes: 218

I invoked the patched createReplyMediaPathNormalizer with an absolute workspace MEDIA path:

createReplyMediaPathNormalizer({
  cfg: {},
  workspaceDir,
  messageProvider: "telegram",
  accountId: "axel"
})

Observed normalized/staged result:

mediaUrl=/tmp/openclaw-86531-proof-state/media/outbound/report---a71b0e58-0613-424e-b6a6-e55a52025a97.html
mediaUrls=[/tmp/openclaw-86531-proof-state/media/outbound/report---a71b0e58-0613-424e-b6a6-e55a52025a97.html]
staged bytes=218
sourceInsideWorkspace=true

This verifies the patched code path stages an absolute workspace HTML media file into managed outbound media before delivery.

Mantis status

The required native Telegram Desktop proof did not fail because candidate delivery was observed broken. It failed because the Mantis proof environment was unavailable:

  • PR run: https://github.com/openclaw/openclaw/actions/runs/26407973316/job/77735960272
  • Artifact: mantis-telegram-desktop-proof-26407973316-1
  • Artifact summary: Mantis could not capture Telegram Desktop proof because the native Telegram Desktop test environment was unavailable.
  • Candidate status in artifact: skipped
  • Baseline status in artifact: skipped
  • comparison.pass=false because no native capture was possible.

I attempted the safe GitHub-hosted retry paths and did not use local Telegram credentials:

  • gh run rerun 26407973316 --failed was rejected by GitHub: run 26407973316 cannot be rerun; its workflow file may be broken
  • Manual workflow dispatch for PR 86531 was rejected by GitHub: HTTP 403: Must have admin rights to Repository

Maintainer decision needed

ClawSweeper also flagged the intentional host-read MIME boundary change: validated text/html is now included in the document media surface. I cannot provide maintainer acceptance from this account. A maintainer should explicitly either:

  • accept validated text/html as host-readable media for this PR, or
  • ask the author to split the HTML MIME expansion from the absolute-workspace staging fix.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 25, 2026
@mjamiv

mjamiv commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

Pushed 6b1fe51ee1 to narrow this PR back to the workspace media staging fix. The host-read text/html allowance has been removed from this PR; .html now remains denied under hostReadCapability and extension-declared HTML fails closed instead of falling through to media sniffing. I also updated the PR body with the current proof and validation commands.

Validation after the scope split:

  • pnpm exec vitest run --config test/vitest/vitest.media.config.ts src/auto-reply/reply/reply-media-paths.test.ts src/media/web-media.test.ts
  • node scripts/test-projects.mjs src/auto-reply/reply/reply-media-paths.test.ts src/media/web-media.test.ts
  • pnpm tsgo:prod
  • pnpm check:test-types
  • git diff --check
  • pnpm run test:extensions:package-boundary:compile
  • pnpm run test:extensions:package-boundary:canary

@clawsweeper re-review

@openclaw-barnacle openclaw-barnacle Bot added triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. and removed proof: supplied External PR includes structured after-fix real behavior proof. labels May 25, 2026
@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@openclaw-barnacle openclaw-barnacle Bot removed the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 25, 2026
@openclaw-barnacle openclaw-barnacle Bot added the proof: supplied External PR includes structured after-fix real behavior proof. label May 25, 2026
@mjamiv

mjamiv commented May 25, 2026

Copy link
Copy Markdown
Contributor Author

Proof gate is now green on 6b1fe51ee1 after fixing the exact PR-body proof field label. The PR body still reflects the narrowed scope: no text/html host-read expansion, and .html remains fail-closed pending separate security review.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

Re-review progress:

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels May 25, 2026
@steipete steipete self-assigned this May 26, 2026
@steipete

Copy link
Copy Markdown
Contributor

Pre-merge maintainer proof for PR #86531.

Behavior addressed: Telegram/local final-reply MEDIA paths under the agent workspace were dropped when a sandbox workspace existed, producing Media failed instead of staging the file for delivery.

Real environment tested: local maintainer checkout on main, plus existing PR CI/proof on head 6b1fe51.

Exact steps or command run after this patch:

  • git diff --check origin/main...refs/remotes/pull/86531
  • git merge-tree --write-tree origin/main refs/remotes/pull/86531
  • reviewed src/auto-reply/reply/reply-media-paths.ts, src/media/web-media.ts, and focused tests

Evidence after fix: diff check clean; merge-tree produced b2dd46cd7366b687f55f3a02314cf5fbfade20b0 with no conflicts; PR head CI/proof rollup inspected earlier was green for relevant lanes.

Observed result after fix: absolute workspace media is staged before sandbox mapping, while outside-workspace host paths and file URLs remain blocked. HTML host-read remains denied and extension-declared .html fails closed before binary/media sniff fallback.

What was not tested: I did not run a live Telegram Bot API send from the reporter environment; GitHub status endpoints intermittently hit secondary rate limits during landing, so I did not refresh the full check-run list after the last main fast-forward.

@steipete steipete merged commit f22c3a5 into openclaw:main May 26, 2026
162 of 179 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

extensions: qa-lab mantis: telegram-visible-proof Mantis should capture Telegram visible proof. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. P1 High-priority user-facing bug, regression, or broken workflow. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: [Bug] Telegram integration fails to send local media (Returns "Media failed") via <final>MEDIA:...

3 participants