Skip to content

fix(pdf): time out idle remote PDF reads#84768

Merged
shakkernerd merged 2 commits into
openclaw:mainfrom
luoyanglang:wolf/pdf-tool-remote-read-idle-timeout
May 21, 2026
Merged

fix(pdf): time out idle remote PDF reads#84768
shakkernerd merged 2 commits into
openclaw:mainfrom
luoyanglang:wolf/pdf-tool-remote-read-idle-timeout

Conversation

@luoyanglang

@luoyanglang luoyanglang commented May 21, 2026

Copy link
Copy Markdown
Contributor

Remote PDF reads could hang a session when a remote PDF server sent headers and then stopped sending body bytes; this threads the existing shared idle-read timeout through the PDF/web-media path so the tool returns instead of staying stuck.

Fixes #68649

Affected surface

  • src/agents/tools/pdf-tool.ts
    • remote PDF load path in createPdfTool
  • src/media/web-media.ts
    • loadWebMediaRaw / readRemoteMediaBuffer option threading
  • src/agents/tools/pdf-tool.test.ts
  • src/media/web-media.test.ts

Scope

  • Reuses the existing shared readIdleTimeoutMs / readResponseWithLimit contract.
  • Adds no PDF-specific config, option, retry policy, or second timeout strategy.
  • Keeps local and sandboxed PDF paths unchanged; the idle timeout is only supplied for http(s) PDF references.
  • Does not change /new, /restart, or broader stuck-session recovery policy.

Real behavior proof

  • Behavior or issue addressed: A controlled remote PDF endpoint that sends headers and initial PDF bytes but then stops sending body data now returns from the remote media load path with an idle-read timeout error instead of hanging indefinitely.
  • Real environment tested: Linux executor, Node via node --import tsx, local HTTP server bound to 127.0.0.1, URL redacted as <local-stall-url>. The exercised OpenClaw path was loadWebMediaRaw -> readRemoteMediaBuffer -> shared readResponseWithLimit idle timeout contract. No secrets were used; SSRF private-network allowance was enabled only for this local controlled harness.
  • Exact steps or command run after this patch:
    1. Rebased to latest origin/main (8284c035a0).
    2. Started a local HTTP server that responds to /stall.pdf.
    3. Returned content-type: application/pdf, content-length, and initial PDF bytes.
    4. Left the response body open without ending it.
    5. Called loadWebMediaRaw(<local-stall-url>, { maxBytes, readIdleTimeoutMs: 300, ssrfPolicy: { allowPrivateNetwork: true } }).
    6. Observed whether the call returned with the configured idle timeout.
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
    Controlled stalled remote PDF proof:
    $ node --import tsx --input-type=module -e '<local controlled stalled-PDF harness>'
    [proof] method=loadWebMediaRaw url=<local-stall-url> readIdleTimeoutMs=300 maxBytes=10485760
    [server] method=GET path=/stall.pdf
    [proof] result=error elapsedMs=573 errorName=MediaFetchError message=Failed to fetch media from <local-stall-url>: Media download stalled: no data received for 300ms
    
    Targeted tests:
    $ pnpm test src/agents/tools/pdf-tool.test.ts src/media/web-media.test.ts
    [test] starting test/vitest/vitest.media.config.ts
    Test Files  1 passed (1)
    Tests  47 passed (47)
    
    [test] starting test/vitest/vitest.agents.config.ts
    Test Files  1 passed (1)
    Tests  20 passed (20)
    
    [test] passed 2 Vitest shards in 59.97s
    
    Full changed-check validation:
    $ pnpm check:changed --base origin/main
    [check:changed] lanes=core, coreTests
    [check:changed] src/agents/tools/pdf-tool.test.ts: core test
    [check:changed] src/agents/tools/pdf-tool.ts: core production
    [check:changed] src/media/web-media.test.ts: core test
    [check:changed] src/media/web-media.ts: core production
    ...
    Found 0 warnings and 0 errors.
    runtime-sidecar-loaders: local runtime sidecar loaders look OK.
    Import cycle check: 0 runtime value cycle(s).
    exit 0
    
  • Observed result after fix: The controlled stalled remote PDF request returned after 573ms with MediaFetchError: Media download stalled: no data received for 300ms. That shows the after-fix path does not stay stuck on an idle remote PDF body read.
  • What was not tested: I did not reproduce the reporter's original 244-page remote PDF endpoint or WhatsApp production session. I did not test /new or /restart stuck-session recovery behavior; that is outside this narrow PDF body-read timeout fix.
  • Before evidence (optional but encouraged): Issue pdf tool can hang indefinitely, blocking session and preventing /new /restart recovery #68649 includes logs for a remote PDF tool call that never returned. ClawSweeper's source review also confirmed current main omits readIdleTimeoutMs on the remote PDF load path even though the lower-level body reader only enforces an idle timeout when that option is supplied.

Additional checks:

$ git diff --check origin/main..HEAD
exit 0

$ node scripts/check-no-conflict-markers.mjs
exit 0

$ gitleaks protect --staged --redact
0 commits scanned; no leaks found

$ gitleaks detect --source . --redact --log-opts origin/main..HEAD
1 commits scanned; no leaks found

Coverage note

Push-before-PR/update search was rerun after syncing origin/main; no separate open PR appears to cover this issue/path. The only Fixes #68649 / #68649 result is this PR (#84768). Other broad media PRs returned by readResponseWithLimit search do not cover the PDF loadWebMediaRaw caller missing readIdleTimeoutMs.

Helper reuse direction: this is a caller-threading fix into the existing shared read-idle timeout contract. It does not alter fetch.ts / readResponseWithLimit semantics and does not introduce a PDF-only timeout policy.

What was not tested

  • I did not reproduce the reporter's original 244-page remote PDF endpoint or WhatsApp production session.
  • I did not test /new or /restart stuck-session recovery behavior; that is outside this narrow PDF body-read timeout fix.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 21, 2026
@clawsweeper

clawsweeper Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR adds a 120s idle-read timeout to remote PDF loads, threads that option through loadWebMediaRaw to readRemoteMediaBuffer, and adds PDF/web-media regression tests.

Reproducibility: yes. at source/proof level. Current main omits readIdleTimeoutMs on the PDF/web-media remote path while the lower-level reader only enforces the idle timeout when supplied, and the PR body now includes controlled stalled-PDF after-fix output.

PR rating
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Summary: The PR has sufficient controlled runtime proof and a focused patch with no blocking findings, with only the fixed-timeout compatibility tradeoff left for maintainer review.

Rank-up moves:

  • none
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (live_output): The PR body includes after-fix copied live output from a controlled stalled remote PDF server showing loadWebMediaRaw returning with the configured idle-timeout error.

Risk before merge

  • The new 120s idle-read limit intentionally turns remote PDF responses with no body bytes for more than 120 seconds into errors; maintainers should accept that compatibility tradeoff before merge.
  • The linked issue also asks for /new, /restart, and broader session-watchdog recovery; this PR only fixes the remote PDF body-read hang and leaves that broader recovery policy outside scope.

Maintainer options:

  1. Accept the PDF timeout tradeoff (recommended)
    Land the PR with the 120s remote-PDF idle limit, accepting that responses which stop yielding body bytes for longer now fail instead of hanging the session.
  2. Make the timeout policy explicit first
    If maintainers want operator control, require a follow-up patch before merge that reuses the existing media timeout contract without inventing a second PDF-specific policy.

Next step before merge
Maintainers should review and accept the 120s remote-PDF idle-timeout compatibility tradeoff; there is no narrow automated repair needed from this review.

Security
Cleared: The diff only threads an existing timeout option through media loading and adds tests; it does not change dependencies, workflows, secrets, permissions, or network trust policy.

Review details

Best possible solution:

Land the narrow PDF/web-media timeout fix after maintainer review accepts the fixed 120s remote-PDF idle limit, while leaving broader stuck-session recovery in the linked issue or a separate follow-up.

Do we have a high-confidence way to reproduce the issue?

Yes, at source/proof level. Current main omits readIdleTimeoutMs on the PDF/web-media remote path while the lower-level reader only enforces the idle timeout when supplied, and the PR body now includes controlled stalled-PDF after-fix output.

Is this the best way to solve the issue?

Yes for the narrow PDF hang. The PR reuses the existing readResponseWithLimit idle-read contract and adds caller-threading plus regression tests instead of introducing a separate PDF timeout mechanism.

Label changes:

  • add P2: This is a focused user-visible stuck-tool bug fix with limited code surface and targeted regression coverage.
  • add merge-risk: 🚨 compatibility: The diff changes remote PDF behavior from potentially waiting indefinitely to failing after a fixed 120s idle body-read window.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix copied live output from a controlled stalled remote PDF server showing loadWebMediaRaw returning with the configured idle-timeout error.
  • add rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit, patch quality is 🐚 platinum hermit, and The PR has sufficient controlled runtime proof and a focused patch with no blocking findings, with only the fixed-timeout compatibility tradeoff left for maintainer review.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix copied live output from a controlled stalled remote PDF server showing loadWebMediaRaw returning with the configured idle-timeout error.
  • remove rating: 🧂 unranked krab: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
  • remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

  • P2: This is a focused user-visible stuck-tool bug fix with limited code surface and targeted regression coverage.
  • merge-risk: 🚨 compatibility: The diff changes remote PDF behavior from potentially waiting indefinitely to failing after a fixed 120s idle body-read window.
  • rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit, patch quality is 🐚 platinum hermit, and The PR has sufficient controlled runtime proof and a focused patch with no blocking findings, with only the fixed-timeout compatibility tradeoff left for maintainer review.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix copied live output from a controlled stalled remote PDF server showing loadWebMediaRaw returning with the configured idle-timeout error.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix copied live output from a controlled stalled remote PDF server showing loadWebMediaRaw returning with the configured idle-timeout error.

What I checked:

  • Current PDF caller lacks idle timeout: Current main calls loadWebMediaRaw for non-sandboxed PDF references with maxBytes, localRoots, and SSRF policy but without readIdleTimeoutMs, so remote body reads rely on the lower-level default behavior. (src/agents/tools/pdf-tool.ts:444, e42726204490)
  • Current web-media raw path does not forward idle timeout: Current main's HTTP branch in loadWebMediaInternal passes remote media fetch options to readRemoteMediaBuffer without a readIdleTimeoutMs field. (src/media/web-media.ts:520, e42726204490)
  • Lower-level contract already supports idle read cancellation: readRemoteMediaBufferOnce passes options.readIdleTimeoutMs as chunkTimeoutMs to readResponseWithLimit, and readResponseWithLimit cancels a stalled stream when that timeout expires. (src/media/fetch.ts:609, e42726204490)
  • PR threads the option through the PDF path: The PR commit adds PDF_REMOTE_READ_IDLE_TIMEOUT_MS = 120_000 and passes it only for HTTP(S) PDF references, leaving local and sandboxed reads unchanged. (src/agents/tools/pdf-tool.ts:55, 9fbbb80f17d2)
  • PR extends web-media option plumbing: The same PR commit adds readIdleTimeoutMs to WebMediaOptions, destructures it, and forwards it into readRemoteMediaBuffer for HTTP(S) media loads. (src/media/web-media.ts:52, 9fbbb80f17d2)
  • Regression coverage targets caller and stalled stream behavior: The PR adds a PDF-tool assertion that remote PDFs pass the 120s timeout and web-media tests that a stalling PDF stream rejects while an active PDF response still loads. (src/media/web-media.test.ts:732, 9fbbb80f17d2)

Likely related people:

  • @steipete: git shortlog --all shows the most commits across the central PDF/media files, and commits b289441e... and 81445a90... introduced/refined the shared response limiter and idle-read snippet behavior used by this PR. (role: shared media timeout and heavy area contributor; confidence: high; commits: b289441e6feb, 81445a901091; files: src/media/fetch.ts, src/media/read-response-with-limit.ts)
  • @vincentkoc: git log --all -S 'function loadWebMediaRaw' points to 73539ac..., which created src/media/web-media.ts and the raw web-media seam this PR extends. (role: web-media seam introducer; confidence: high; commits: 73539ac78720; files: src/media/web-media.ts)
  • @tysoncung: Commit 4d501e4... added readIdleTimeoutMs to remote media fetches, which is the lower-level contract this PR threads into the PDF/web-media caller. (role: idle-timeout contract introducer; confidence: medium; commits: 4d501e4ccf2f; files: src/media/fetch.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against e42726204490.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 21, 2026
@clawsweeper

clawsweeper Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Tiny Crabkin

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: collects tiny proofs.
Image traits: location diff observatory; accessory rollback rope; palette amber, ink, and glacier blue; mood calm; pose leaning over a miniature review desk; shell starlit enamel shell; lighting warm desk-lamp glow; background delicate sparkle particles.
Share on X: post this hatch
Copy: My PR egg hatched a 🌱 uncommon Tiny Crabkin in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@luoyanglang luoyanglang force-pushed the wolf/pdf-tool-remote-read-idle-timeout branch from d6f942d to 9fbbb80 Compare May 21, 2026 06:42
@luoyanglang

Copy link
Copy Markdown
Contributor Author

Updated the PR body with controlled stalled remote PDF proof, rebased onto latest origin/main, and pushed the refreshed branch. @clawsweeper re-review

@luoyanglang

Copy link
Copy Markdown
Contributor Author

Reformatted the proof section with the required behavior/environment/steps/evidence/observedResult/notTested fields while preserving the controlled stalled PDF evidence. @clawsweeper re-review

@luoyanglang

Copy link
Copy Markdown
Contributor Author

Adjusted the real behavior proof to use the exact template field lines that the proof checker parses. @clawsweeper re-review

@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 21, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 21, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 21, 2026
@shakkernerd shakkernerd force-pushed the wolf/pdf-tool-remote-read-idle-timeout branch from 16cce1d to 2b9c1db Compare May 21, 2026 08:31
@shakkernerd shakkernerd merged commit e3b77d6 into openclaw:main May 21, 2026
23 checks passed
@luoyanglang luoyanglang deleted the wolf/pdf-tool-remote-read-idle-timeout branch May 21, 2026 13:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: S status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pdf tool can hang indefinitely, blocking session and preventing /new /restart recovery

2 participants