Skip to content

media: make MEDIA_MAX_BYTES configurable via media.maxBytes#40950

Closed
BertieJim wants to merge 3 commits into
openclaw:mainfrom
BertieJim:fix/configurable-media-max-bytes
Closed

media: make MEDIA_MAX_BYTES configurable via media.maxBytes#40950
BertieJim wants to merge 3 commits into
openclaw:mainfrom
BertieJim:fix/configurable-media-max-bytes

Conversation

@BertieJim

@BertieJim BertieJim commented Mar 9, 2026

Copy link
Copy Markdown

Summary

  • Problem: MEDIA_MAX_BYTES (5 MB) is hardcoded in src/media/store.ts — users cannot override it, so files >5 MB silently fail to stage into sandbox containers.
  • Why it matters: Common file types (PDFs, presentations, high-res images) routinely exceed 5 MB, making sandboxed agents unable to process them.
  • What changed: Added media.maxBytes config field; replaced hardcoded constant with getMediaMaxBytes()/setMediaMaxBytes() pattern; gateway boot reads config and sets the limit.
  • What did NOT change (scope boundary): Per-channel download limits (e.g. Telegram's mediaMaxMb) are untouched. tools.media.*.maxBytes (AI understanding limits) are untouched. Default behavior (5 MB) is preserved when config is unset.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • New config field media.maxBytes (integer, optional, max 500 MB) controls the inbound media staging/store limit.
  • Error messages now show the actual configured limit (e.g. "Media exceeds 10MB limit") instead of hardcoded "5MB".
  • Default remains 5 MB when media.maxBytes is not set — no behavior change for existing users.

Example config:

{
  "media": {
    "maxBytes": 52428800
  }
}

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No — the limit only raises an existing cap; the media store directory and trust boundary are unchanged.

Repro + Verification

Environment

  • OS: macOS (Darwin 25.3.0) + Ubuntu (CI)
  • Runtime/container: Node 22, pnpm
  • Model/provider: N/A (infrastructure change)
  • Integration/channel: Any channel with sandboxed agents
  • Relevant config: { "media": { "maxBytes": 52428800 } }

Steps

  1. Set media.maxBytes in openclaw.json to e.g. 52428800 (50 MB)
  2. Configure a sandboxed agent (sandbox.mode: "non-main")
  3. Send a file larger than 5 MB (e.g. 10 MB PDF)

Expected

  • File stages into sandbox successfully

Actual (before fix)

  • File rejected / not available inside sandbox

Evidence

  • Failing test/log before + passing after — all 56 tests pass (store.test.ts, server.test.ts, stage-sandbox-media.test.ts, schema.help.quality.test.ts)
  • Trace/log snippets — N/A (type-check + lint clean)
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: type-check (pnpm tsgo), lint/format (pnpm check), 56 unit tests across 4 test files, secrets baseline scan
  • Edge cases checked: config absent (resets to default on boot), non-round byte values display correctly in error messages (e.g. 4.77MB not 4MB), schema rejects values >500 MB
  • What I did not verify: end-to-end manual test with a running gateway and real >5 MB file transfer (CI integration tests cover the store/staging paths)

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

All 3 bot review threads were addressed and resolved:

  1. greptile: setMediaMaxBytes not reset when config absent → fixed with unconditional call using ?? MEDIA_MAX_BYTES_DEFAULT
  2. greptile: Truncated MB in error messages → fixed with formatBytesAsMb() helper using toFixed(2)
  3. chatgpt-codex: Same as fix: add @lid format support and allowFrom wildcard handling #1 → addressed by same fix

Compatibility / Migration

  • Backward compatible? YesMEDIA_MAX_BYTES still exported (deprecated) for any external consumers; default 5 MB preserved.
  • Config/env changes? Yes — new optional media.maxBytes field.
  • Migration needed? No — field is optional with sensible default.

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Remove media.maxBytes from openclaw.json and restart. The 5 MB default is restored.
  • Files/config to restore: openclaw.json (remove media.maxBytes key)
  • Known bad symptoms reviewers should watch for: Unexpectedly large files consuming disk in the media store directory if limit is set very high.

Risks and Mitigations

  • Risk: User sets maxBytes extremely high, filling disk with large media files.
    • Mitigation: Schema enforces max(500 * 1024 * 1024) (500 MB cap). media.ttlHours provides automatic cleanup.

🤖 Generated with Claude Code

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: S labels Mar 9, 2026
@greptile-apps

greptile-apps Bot commented Mar 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR makes the MEDIA_MAX_BYTES limit configurable via media.maxBytes in the top-level config block, replacing the hardcoded 5 MB constant in store.ts with a mutable module-level variable backed by getMediaMaxBytes()/setMediaMaxBytes() accessors. The change is backward compatible (the old MEDIA_MAX_BYTES export is preserved with a deprecation notice), and the schema, types, labels, and help text are all updated consistently.

Key observations:

  • The conditional guard in server.impl.ts (if (typeof ... === "number")) only calls setMediaMaxBytes when the config key is present. In a multi-invocation scenario within the same process (integration tests, in-process config reload), omitting media.maxBytes from config will not reset the limit back to the 5 MB default — leaving a stale value from a prior call. Calling setMediaMaxBytes unconditionally with cfgAtStart.media?.maxBytes ?? MEDIA_MAX_BYTES_DEFAULT would fix this.
  • Error messages across three callsites use toFixed(0) to render the byte limit as a whole-MB number. For limits that are not exact multiples of 1 MiB (e.g. 5000000 bytes ≈ 4.77 MB), the message would report "4MB" instead of the true limit — potentially confusing when users try to tune the value.
  • All other plumbing (Zod validation, TypeScript types, getter calls in server.ts and stage-sandbox-media.ts) looks correct and clean.

Confidence Score: 3/5

  • Mostly safe to merge; one logic issue in server.impl.ts could leave a stale byte limit in multi-invocation test environments or in-process config reloads.
  • The implementation is clean and well-scoped. The logic issue (conditional vs unconditional setMediaMaxBytes call) is low-severity for production single-process deployments but is a real test-isolation risk given how the gateway test helpers cache and reuse the server module across test cases in the same worker. The MB-rounding in error messages is a minor UX issue that doesn't affect correctness.
  • src/gateway/server.impl.ts — the conditional setMediaMaxBytes call should be made unconditional to prevent stale state across restarts within the same process.

Last reviewed commit: 55e7a88

Comment thread src/gateway/server.impl.ts Outdated
Comment thread src/media/store.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55e7a8882e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/gateway/server.impl.ts Outdated
@BertieJim BertieJim force-pushed the fix/configurable-media-max-bytes branch from 29b77ed to 2761753 Compare March 9, 2026 12:18
Add `media.maxBytes` to the top-level media config block (Zod schema, types,
labels, help text) so users can override the hardcoded 5 MB inbound media
staging/store limit.

- Add setMediaMaxBytes()/getMediaMaxBytes() to store.ts (follows existing
  setMediaStoreNetworkDepsForTest pattern for module-level state)
- Wire config read in gateway boot (server.impl.ts), always resetting to
  default when config omits the field
- Update server.ts and stage-sandbox-media.ts to use the getter
- Add formatBytesAsMb() helper for accurate error messages (toFixed(2))
- Schema caps the value at 500 MB; default remains 5 MB when unset

Closes openclaw#40880
@BertieJim BertieJim force-pushed the fix/configurable-media-max-bytes branch from 4f92643 to 97d5150 Compare March 9, 2026 14:26

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 97d51500bc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/media/store.ts
Comment on lines +377 to +381
maxBytes = mediaMaxBytes,
originalFilename?: string,
): Promise<SavedMedia> {
if (buffer.byteLength > maxBytes) {
throw new Error(`Media exceeds ${(maxBytes / (1024 * 1024)).toFixed(0)}MB limit`);
throw new Error(`Media exceeds ${formatBytesAsMb(maxBytes)} limit`);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce global media cap when per-channel maxBytes is passed

saveMediaBuffer now defaults maxBytes to the configured global limit, but the check still uses only the caller-provided argument, so any channel path that passes its own limit can bypass media.maxBytes. For example, inbound paths in src/web/inbound/monitor.ts and src/telegram/bot/delivery.resolve-media.ts pass channel-specific values, which means setting media.maxBytes lower (e.g. 10MB) does not prevent larger files from being persisted in the media store despite the new config being documented as the maximum accepted inbound size for store/staging.

Useful? React with 👍 / 👎.

@BertieJim

Copy link
Copy Markdown
Author

Please Merge this PR, it will help me a lot. The bug is severe, Merge this PR can fix it immediately.

@klingai-dev

Copy link
Copy Markdown

Plz merge this pr. since video generation is released, large video files are very much needed.

@openclaw-barnacle openclaw-barnacle Bot removed the gateway Gateway runtime label Apr 9, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d8cc6c0bbd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

rootDir: effectiveWorkspaceDir,
relativeDestPath: relativeDest,
maxBytes: STAGED_MEDIA_MAX_BYTES,
maxBytes: getMediaMaxBytes(),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Import getMediaMaxBytes in sandbox media staging

stageSandboxMedia now calls getMediaMaxBytes() when copying attachments, but this module never imports that symbol from src/media/store.ts. In runtime, each call throws a ReferenceError inside the try, then the catch logs and continues, so valid inbound files are silently skipped instead of being staged into the sandbox.

Useful? React with 👍 / 👎.

Comment thread src/media/server.ts
rootDir: mediaDir,
relativePath: id,
maxBytes: MAX_MEDIA_BYTES,
maxBytes: getMediaMaxBytes(),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Import getMediaMaxBytes in media download route

The /media/:id handler uses getMediaMaxBytes() without importing it. When this line executes, it throws ReferenceError, which falls into the generic error path and returns 404, so valid media IDs can no longer be served by this route.

Useful? React with 👍 / 👎.

Comment thread src/media/store.ts
}

/** Override the media max-bytes limit (called from gateway boot after config load). */
export function setMediaMaxBytes(bytes: number): void {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Wire media.maxBytes into startup limit initialization

This commit adds setMediaMaxBytes and a new media.maxBytes config field, but no startup path actually calls the setter (repo search for setMediaMaxBytes( at this commit only finds this declaration). As a result, the effective limit stays at the default 5 MB and the new configuration knob has no effect.

Useful? React with 👍 / 👎.

@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@clawsweeper

clawsweeper Bot commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

Thanks for the context here. I swept through the related work, and this is now duplicate or superseded.

Close this PR as superseded by the open canonical issue: the requested configurability remains valid, but this stale conflicting branch has correctness blockers, no real behavior proof, and no longer matches current main closely enough to keep as the landing path.

Canonical path: Keep #40880 as the canonical tracker and replace this stale branch with a current-main PR that defines the config contract, wires store/staging/chat preflight consistently, and includes real gateway proof.

So I’m closing this here and keeping the remaining discussion on #40880.

Review details

Best possible solution:

Keep #40880 as the canonical tracker and replace this stale branch with a current-main PR that defines the config contract, wires store/staging/chat preflight consistently, and includes real gateway proof.

Do we have a high-confidence way to reproduce the issue?

Yes, by source inspection and existing coverage: current main still has fixed 5 MiB store, sandbox staging, and chat preflight caps, and the gateway test constructs a 6 MiB PDF that is rejected as over the sandbox staging limit. I did not run a live gateway transfer because this review is read-only.

Is this the best way to solve the issue?

No, this branch is not the best current solution because it is stale, conflicting, has runtime import bugs, and does not initialize or thread the configured cap through the current chat.send preflight path. A narrow replacement on current main is the safer path.

Security review:

Security review needs attention: No dependency or workflow supply-chain concern was found, but the media resource-limit boundary is inconsistently defined and not applied by the PR head.

  • [medium] Clarify media cap precedence and enforcement — src/media/store.ts:427
    The new setting is described as the maximum accepted inbound media size, but caller-provided maxBytes paths and the current chat.send sandbox preflight can still enforce different limits; that ambiguity matters for disk/resource limits and attachment delivery.
    Confidence: 0.74

What I checked:

  • current main fixed media-store cap: Current main still exports MEDIA_MAX_BYTES = 5 * 1024 * 1024 and uses it as the default media-store cap, so the requested behavior is not implemented on main. (src/media/store.ts:22, d7a078f1962b)
  • current main fixed sandbox staging cap: Sandbox staging aliases STAGED_MEDIA_MAX_BYTES to MEDIA_MAX_BYTES and passes that fixed cap to staging copies. (src/auto-reply/reply/stage-sandbox-media.ts:21, d7a078f1962b)
  • current main chat preflight still uses fixed cap: The chat.send sandbox preflight rejects offloaded attachments above MEDIA_MAX_BYTES, so a replacement fix must wire any configured cap through this current-main path too. (src/gateway/server-methods/chat.ts:1007, d7a078f1962b)
  • current main config contract lacks media.maxBytes: The strict top-level media schema currently exposes preserveFilenames and ttlHours, but not maxBytes. (src/config/zod-schema.ts:726, d7a078f1962b)
  • source-level regression coverage exists for the remaining issue: Current gateway coverage constructs a 6 MiB PDF above the 5 MiB staging cap and expects a sandbox staging-limit rejection, which is a source-level reproduction of the reported gap. (src/gateway/server-methods/chat.directive-tags.test.ts:3266, d7a078f1962b)
  • PR branch is conflicting and stale: Live PR metadata reports mergeable: CONFLICTING; the branch was opened on 2026-03-09 and its head remains d8cc6c0bbd55a91b90bf67a1ef673f6422b6d508. (d8cc6c0bbd55)

Likely related people:

  • steipete: Recent media-store and current-main sweep history touches the fixed cap and adjacent media safety work. (role: recent area contributor; confidence: high; commits: 53d007bc878c, dd07fb400fc5; files: src/media/store.ts, src/auto-reply/reply/stage-sandbox-media.ts, src/gateway/server-methods/chat.ts)
  • samzong: Introduced the current non-image attachment offload path where the sandbox preflight cap must be kept in sync with the staging limit. (role: gateway attachment offload contributor; confidence: high; commits: 25ef9c0c419d; files: src/gateway/server-methods/chat.ts, src/gateway/chat-attachments.ts, src/auto-reply/reply/stage-sandbox-media.ts)
  • vincentkoc: Recent media-store work overlaps the media retention and config-adjacent surface affected by this requested setting. (role: recent media/config contributor; confidence: medium; commits: 2bdbf240a92d, ba9eaf2ee219; files: src/media/store.ts, src/config/zod-schema.ts)
  • gumadeiras: Recent sandbox media root resolution work is adjacent to the staging path whose fixed cap this PR changes. (role: sandbox staging contributor; confidence: medium; commits: f334ca2b509f; files: src/auto-reply/reply/stage-sandbox-media.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against d7a078f1962b.

@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Apr 28, 2026
@clawsweeper clawsweeper Bot added the P2 Normal backlog priority with limited blast radius. label May 16, 2026
@openclaw-barnacle openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 16, 2026
@clawsweeper clawsweeper Bot added impact:message-loss Channel message delivery can be lost, duplicated, or misrouted. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. and removed impact:message-loss Channel message delivery can be lost, duplicated, or misrouted. labels May 17, 2026
@clawsweeper clawsweeper Bot added merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. labels May 18, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@clawsweeper

clawsweeper Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper applied the proposed close for this PR.

@clawsweeper clawsweeper Bot closed this May 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. P2 Normal backlog priority with limited blast radius. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: tools/sandbox: MEDIA_MAX_BYTES (5 MB) is hardcoded and not user-configurable

2 participants