fix: trigger --wait retry on codex stderr-only quota errors #311

Merged
umputun merged 1 commit into master from fix-codex-stderr-limit-pattern
Apr 30, 2026

@umputun umputun commented Apr 29, 2026

Codex emits ERROR: You've hit your usage limit to stderr while stdout is empty on failure. Pattern matching scanned stdout only, so the limit pattern never matched and runWithLimitRetry saw a generic error rather than *LimitPatternError. With --wait set, the process exited instead of retrying.
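
The retry decision described above depends on the error's type, which a short sketch can make concrete. Only the names `LimitPatternError` and `runWithLimitRetry` come from this PR; the struct fields, the `runWithRetry` signature, the attempt cap, and `errGeneric` are assumptions made for illustration and may not match the real code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// LimitPatternError marks a failure caused by a matched limit pattern.
// Hypothetical shape: only the type name appears in the PR description.
type LimitPatternError struct{ Pattern string }

func (e *LimitPatternError) Error() string { return "limit pattern matched: " + e.Pattern }

// errGeneric stands in for an untyped failure such as a plain non-zero exit.
var errGeneric = errors.New("exit status 1")

// runWithRetry calls run and retries only when the error chain contains a
// *LimitPatternError and a positive wait was requested. It returns the number
// of attempts made and the final error. The maxAttempts cap is an assumption.
func runWithRetry(run func() error, wait time.Duration, maxAttempts int) (int, error) {
	for attempt := 1; ; attempt++ {
		err := run()
		if err == nil {
			return attempt, nil
		}
		var limitErr *LimitPatternError
		if wait <= 0 || attempt >= maxAttempts || !errors.As(err, &limitErr) {
			return attempt, err // generic error, no wait configured, or out of attempts
		}
		time.Sleep(wait)
	}
}

func main() {
	calls := 0
	run := func() error {
		calls++
		if calls < 3 {
			// Wrapped on purpose: errors.As still finds the typed error.
			return fmt.Errorf("codex failed: %w", &LimitPatternError{Pattern: "You've hit your usage limit"})
		}
		return nil
	}
	attempts, err := runWithRetry(run, 10*time.Millisecond, 5)
	fmt.Println(attempts, err)

	// A generic error is returned after the first attempt, which is what
	// happened before the fix when the limit text was only on stderr.
	attempts, err = runWithRetry(func() error { return errGeneric }, 10*time.Millisecond, 5)
	fmt.Println(attempts, err)
}
```

In this sketch the pre-fix bug corresponds to the second call: the quota message never became a typed error, so the retry loop had nothing to act on.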

Fix

  • processStderr now scans each incoming line live for limit/error patterns before the 5-line / 256-rune tail truncation, so detection is eviction- and truncation-resistant. The error-context tail stays unchanged for human-readable reporting.
  • Stderr scan is gated by isCodexErrorLine (matches error:/fatal:/panic: prefix, case-insensitive) so progress chatter (header banners, bold summaries, model thinking that may legitimately mention "rate limit" while reviewing code) cannot trigger false positives.
  • checkPatterns priority is limit-class first across both sources: stdout limit -> stderr limit -> stdout error -> stderr error. Within a class stdout wins. A real prefix-gated stderr quota diagnostic cannot be downgraded to a non-retryable PatternMatchError when partial stdout matches an ErrorPattern.
  • You've hit your usage limit added to default codex_limit_patterns and codex_error_patterns. Users who customized the previous default (Rate limit,quota exceeded) need to append the new wording manually or comment the line out to inherit embedded defaults; README.md and llms.txt include the upgrade note.
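
As a rough illustration of the first two bullets, the sketch below checks each stderr line in full before it is truncated or evicted from the tail buffer, and only considers lines that pass a prefix gate. It is not the code from `pkg/executor/codex.go`: `isErrorLine`, `matchPattern`, and `scanStderr` are hypothetical stand-ins, and case-insensitive substring matching, whitespace trimming, and keeping the first 256 runes are all assumptions.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

const (
	tailLines = 5   // lines kept for human-readable error context
	tailRunes = 256 // max runes kept per context line
)

// isErrorLine reports whether line starts with error:, fatal:, or panic:,
// ignoring case and surrounding whitespace. Stand-in for isCodexErrorLine.
func isErrorLine(line string) bool {
	l := strings.ToLower(strings.TrimSpace(line))
	for _, p := range []string{"error:", "fatal:", "panic:"} {
		if strings.HasPrefix(l, p) {
			return true
		}
	}
	return false
}

// matchPattern returns the first pattern found in line, or "" if none match.
// Case-insensitive substring matching is an assumption.
func matchPattern(line string, patterns []string) string {
	l := strings.ToLower(line)
	for _, p := range patterns {
		if p != "" && strings.Contains(l, strings.ToLower(p)) {
			return p
		}
	}
	return ""
}

// scanStderr scans the full text of each line first, then truncates the line
// and appends it to a bounded tail, so a match survives both steps.
func scanStderr(stderr string, limitPatterns []string) (limitMatch string, tail []string) {
	sc := bufio.NewScanner(strings.NewReader(stderr))
	for sc.Scan() {
		line := sc.Text()
		if limitMatch == "" && isErrorLine(line) {
			limitMatch = matchPattern(line, limitPatterns)
		}
		if r := []rune(line); len(r) > tailRunes {
			line = string(r[:tailRunes])
		}
		tail = append(tail, line)
		if len(tail) > tailLines {
			tail = tail[1:] // evict the oldest context line
		}
	}
	return limitMatch, tail
}

func main() {
	var b strings.Builder
	b.WriteString("thinking: the handler should back off on rate limit\n") // chatter, ignored by the gate
	b.WriteString("ERROR: You've hit your usage limit\n")
	for i := 0; i < 8; i++ {
		fmt.Fprintf(&b, "trailing line %d\n", i) // pushes the ERROR line out of the tail
	}
	match, tail := scanStderr(b.String(), []string{"Rate limit", "You've hit your usage limit"})
	fmt.Printf("match=%q tail=%d lines\n", match, len(tail))
}
```

Separating detection from the context buffer means the buffer can stay small without affecting whether a retry is triggered.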

Tests

12 new tests in pkg/executor/codex_test.go cover stderr-only matches (limit and error), eviction resistance with >5 trailing lines, 256-rune truncation resistance, prefix-gate suppression of non-error chatter, stdout-vs-stderr precedence within and across classes, cancellation handling, the clean-exit guard, and the isCodexErrorLine helper directly.

Related to #308.

Codex's OpenAI/ChatGPT plan-quota error ("ERROR: You've hit your usage
limit") is emitted on stderr while stdout is empty on failure. Previously
pattern matching scanned stdout only, so --wait could never retry on it
and the process exited with a generic error.

Fix:
- processStderr now scans each incoming line live for limit/error
  patterns before tail truncation, so detection is eviction- and
  truncation-resistant (the 5-line / 256-rune error-context buffer
  remains for human-readable error messages only).
- The stderr scan is gated by isCodexErrorLine (matches error/fatal/
  panic prefix, case-insensitive) so progress chatter — header banners,
  bold summaries, model thinking that may legitimately mention "rate
  limit" while reviewing code — cannot trigger false positives.
- checkPatterns priority is limit-class first across both sources:
  stdout limit -> stderr limit -> stdout error -> stderr error. A real
  prefix-gated stderr quota diagnostic cannot be downgraded to a
  non-retryable PatternMatchError when partial stdout matches an
  ErrorPattern. Within a class, stdout wins over stderr.
- "You've hit your usage limit" added to default codex_limit_patterns
  and codex_error_patterns. Users who customized the previous default
  (Rate limit,quota exceeded) need to append the new wording manually
  or comment the line out to inherit embedded defaults; README.md and
  llms.txt include the upgrade note.
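
The precedence listed above (stdout limit -> stderr limit -> stdout error -> stderr error) can be sketched as a single ordered switch. This is an illustrative sketch only; `matchClass`, `found`, and `resolve` are hypothetical names and do not reflect the actual `checkPatterns` signature.

```go
package main

import "fmt"

// matchClass says which kind of pattern won.
type matchClass int

const (
	classNone  matchClass = iota
	classLimit            // retryable when --wait is set
	classError            // not retryable
)

// found holds the first matched pattern per source and class ("" = no match).
type found struct {
	stdoutLimit, stderrLimit string
	stdoutError, stderrError string
}

// resolve applies limit-class-first precedence across both sources;
// within a class, stdout wins over stderr.
func resolve(f found) (matchClass, string) {
	switch {
	case f.stdoutLimit != "":
		return classLimit, f.stdoutLimit
	case f.stderrLimit != "":
		return classLimit, f.stderrLimit
	case f.stdoutError != "":
		return classError, f.stdoutError
	case f.stderrError != "":
		return classError, f.stderrError
	}
	return classNone, ""
}

func main() {
	// Partial stdout matched an error pattern, but stderr carried the quota
	// diagnostic: the limit class must win so the run stays retryable.
	class, pattern := resolve(found{
		stdoutError: "error",
		stderrLimit: "You've hit your usage limit",
	})
	fmt.Println(class == classLimit, pattern)
}
```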

Tests cover stderr-only matches (limit and error), eviction resistance
with >5 trailing lines, 256-rune truncation resistance, prefix-gate
suppression of non-error chatter, stdout-vs-stderr precedence within
and across classes, cancellation handling, the clean-exit guard, and
the isCodexErrorLine helper directly.

Related to #308
Copilot AI review requested due to automatic review settings April 29, 2026 22:30
@cloudflare-workers-and-pages

Deploying ralphex with Cloudflare Pages

Latest commit: 4eb2ddd
Status: ✅  Deploy successful!
Preview URL: https://f064f2ea.ralphex.pages.dev
Branch Preview URL: https://fix-codex-stderr-limit-patte.ralphex.pages.dev

Copilot AI left a comment

Pull request overview

Fixes --wait/limit-retry behavior for Codex failures where the quota/limit diagnostic is emitted only on stderr (with empty stdout), by making pattern detection resilient to stderr tail truncation/eviction and by defining clear stdout-vs-stderr precedence.

Changes:

  • Add live, per-line stderr scanning for limit/error patterns (prefix-gated) and unify pattern priority across stdout+stderr in CodexExecutor.
  • Expand default Codex limit/error patterns to include You've hit your usage limit.
  • Add a comprehensive Codex stderr-focused test suite and update docs with an upgrade note for users with customized patterns.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/executor/codex.go | Live stderr pattern scan (prefix-gated), new unified precedence logic via checkPatterns, and extended stderrResult. |
| pkg/executor/codex_test.go | Adds targeted tests covering stderr-only matches, truncation/eviction resistance, precedence, cancellation, and isCodexErrorLine. |
| pkg/config/values_test.go | Updates expected embedded defaults to include the new Codex pattern string. |
| pkg/config/defaults/config | Updates default codex_*_patterns to include You've hit your usage limit. |
| pkg/config/config.go | Formatting-only alignment change in config struct assembly. |
| README.md | Documents updated defaults and adds an explicit upgrade note for customized pattern lists. |
| llms.txt | Mirrors README guidance on updated defaults and stderr scanning behavior. |
| CLAUDE.md | Updates internal docs to describe the new stderr scanning + precedence behavior and updated defaults. |

@umputun umputun merged commit 0f5afcc into master Apr 30, 2026
9 checks passed
@umputun umputun deleted the fix-codex-stderr-limit-pattern branch April 30, 2026 00:31