Skip to content

fix(translation): hint upstream model to omit empty pages on Read tool#437

Merged
icebear0828 merged 1 commit intodevfrom
fix/read-pages-empty-string-hint
May 3, 2026
Merged

fix(translation): hint upstream model to omit empty pages on Read tool#437
icebear0828 merged 1 commit intodevfrom
fix/read-pages-empty-string-hint

Conversation

@icebear0828
Copy link
Copy Markdown
Owner

Summary

Upstream gpt-5.x tends to fill optional string fields with "" rather than
omit them. For Claude Code's Read tool, that means pages: "" even on
non-PDF files — the harness then routes the empty string through the PDF
branch and errors out.

The proxy is otherwise pass-through for tool_use arguments, so the cleanest
nudge is to append a one-line hint to the pages property's description in
the schema we forward upstream. The model reads property descriptions when
deciding whether to fill a field; this is the cheapest, least invasive fix.

Changes

  • src/translation/tool-format.ts — new augmentReadToolSchema helper, called from anthropicToolsToCodex only when name === "Read". Appends " Omit this field entirely for non-PDF files; do not pass an empty string." to the pages property description. Idempotent (won't double-append on multi-turn replays). Zero impact on other tools.
  • tests/unit/translation/tool-format.test.ts — 5 new specs covering the augmentation, idempotency, non-Read isolation, schema with no pages field, and preservation of sibling fields.
  • CHANGELOG.md[Unreleased] / Fixed entry.

Test Plan

  • npx vitest run tests/unit/translation/tool-format.test.ts — 60/60 pass
  • npx vitest run — 1654 pass, 1 skipped, 145 files
  • npx tsc --noEmit — clean

Notes

  • The test file's diff stat looks bigger than the real change because the file in HEAD has mixed CRLF/LF line endings (460 CRLF + 224 LF) — pre-existing project mess. Use git diff --ignore-cr-at-eol tests/unit/translation/tool-format.test.ts to see the actual delta is +93 LOC. Happy to follow up with a separate normalization PR if desired.
  • This is a temporary nudge until the underlying fix happens at the harness side (treat pages === "" as unset). I can also file an issue against anthropics/claude-code if helpful.

Upstream gpt-5.x tends to fill optional string fields (Read tool's `pages`)
with `""` rather than omit them. Claude Code harness then routes the empty
string into the PDF branch and errors. Lightest-touch nudge: in
`anthropicToolsToCodex`, when the tool name is "Read", append a sentence
to the `pages` property description telling the model to omit the field
for non-PDF files.

Idempotent (safe across multi-turn replays), zero impact on other tools,
keeps Codex tool_use arguments otherwise pass-through. 5 new vitest
specs cover the augmentation, idempotency, non-Read tool isolation, and
field preservation.
@icebear0828 icebear0828 force-pushed the fix/read-pages-empty-string-hint branch from 5698200 to 7515feb Compare May 3, 2026 10:24
@icebear0828 icebear0828 merged commit fbe4aad into dev May 3, 2026
1 check passed
@icebear0828 icebear0828 deleted the fix/read-pages-empty-string-hint branch May 3, 2026 11:53
icebear0828 added a commit that referenced this pull request May 5, 2026
The soak check measures `now - dev_HEAD_timestamp >= 24h`, which means
every new merge into dev resets the clock. Under any non-trivial merge
cadence, dev never satisfies the soak gate and master stagnates: PRs
#437/#438/#439/#440/#442 all stacked on dev for a week with no
promotion.

Add a `force_skip_soak` boolean input to workflow_dispatch (default
false). Schedule cron remains untouched and continues to enforce the
24h rule. Only manual triggers can bypass, and only when the operator
explicitly sets the input to true — intended for sync-back / merge
commits whose content has actually been on dev long enough but whose
HEAD timestamp is misleadingly fresh.

Test plan: yaml syntax verified via js-yaml. Functional verification
will be the next manual workflow_dispatch run with the input set.

Co-authored-by: icebear0828 <icebear0828@users.noreply.github.com>
icebear0828 added a commit that referenced this pull request May 5, 2026
#437)

Upstream gpt-5.x tends to fill optional string fields (Read tool's `pages`)
with `""` rather than omit them. Claude Code harness then routes the empty
string into the PDF branch and errors. Lightest-touch nudge: in
`anthropicToolsToCodex`, when the tool name is "Read", append a sentence
to the `pages` property description telling the model to omit the field
for non-PDF files.

Idempotent (safe across multi-turn replays), zero impact on other tools,
keeps Codex tool_use arguments otherwise pass-through. 5 new vitest
specs cover the augmentation, idempotency, non-Read tool isolation, and
field preservation.

Co-authored-by: icebear0828 <icebear0828@users.noreply.github.com>
icebear0828 added a commit that referenced this pull request May 5, 2026
The soak check measures `now - dev_HEAD_timestamp >= 24h`, which means
every new merge into dev resets the clock. Under any non-trivial merge
cadence, dev never satisfies the soak gate and master stagnates: PRs
#437/#438/#439/#440/#442 all stacked on dev for a week with no
promotion.

Add a `force_skip_soak` boolean input to workflow_dispatch (default
false). Schedule cron remains untouched and continues to enforce the
24h rule. Only manual triggers can bypass, and only when the operator
explicitly sets the input to true — intended for sync-back / merge
commits whose content has actually been on dev long enough but whose
HEAD timestamp is misleadingly fresh.

Test plan: yaml syntax verified via js-yaml. Functional verification
will be the next manual workflow_dispatch run with the input set.

Co-authored-by: icebear0828 <icebear0828@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant