Skip to content

fix(openai): sync OAuth context and retry fixes#302

Merged
Astro-Han merged 3 commits into
devfrom
fix/openai-oauth-context
Apr 28, 2026
Merged

fix(openai): sync OAuth context and retry fixes#302
Astro-Han merged 3 commits into
devfrom
fix/openai-oauth-context

Conversation

@Astro-Han

@Astro-Han Astro-Han commented Apr 28, 2026

Copy link
Copy Markdown
Owner

Summary

Syncs the first missed upstream runtime/provider fixes tracked in #298.

Why

PawWork was showing API-side GPT-5.5 context metadata for OpenAI OAuth sessions even though OAuth requests are routed through the Codex backend. That made the context UI and proactive compaction threshold look safe near 270K tokens while the effective Codex limit was smaller. The retry parser fix is part of the same upstream OpenAI runtime follow-up batch.

Related Issue

Refs #298

How To Verify

bun install --frozen-lockfile
bun test test/plugin/codex.test.ts --timeout 30000
bun test test/session/retry.test.ts --timeout 30000
bun run --cwd packages/opencode typecheck

Screenshots or Recordings

No visible UI changes.

Checklist

  • I linked the related issue, or stated why there is no issue
  • This PR has type, scope, and priority labels, or I requested maintainer labeling
  • I listed the relevant verification steps, including tests when behavior changed
  • I manually checked visible UI or copy changes when needed, with screenshots or recordings
  • I considered macOS and Windows impact for desktop, packaging, updater, signing, paths, shell, or permissions changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, or generated/local file changes when relevant
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Summary by CodeRabbit

  • New Features

    • Added model-specific token limits for GPT-5.5 in OAuth-authenticated flows.
  • Bug Fixes

    • Improved stream error parsing and payload handling.
    • Recognize server-side errors as retryable API errors and better extract error messages.
  • Tests

    • Added unit and integration tests for GPT-5.5 OAuth handling and for retry behavior on streaming errors.

@Astro-Han Astro-Han added bug Something isn't working P1 High priority upstream Tracked upstream or vendor behavior harness Model harness, prompts, tool descriptions, and session mechanics labels Apr 28, 2026
@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 2bf3ac06-9800-4c0a-b732-bd8d21b406d8

📥 Commits

Reviewing files that changed from the base of the PR and between 6a45ea1 and 32a79ec.

📒 Files selected for processing (4)
  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
  • packages/opencode/test/plugin/codex.test.ts
  • packages/opencode/test/session/retry.test.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: unit-windows-opencode-config-project
  • GitHub Check: unit-opencode
  • GitHub Check: unit-windows-app
  • GitHub Check: unit-windows-desktop
  • GitHub Check: unit-windows-opencode-session
  • GitHub Check: unit-app
  • GitHub Check: unit-windows-opencode-server-tools
  • GitHub Check: unit-desktop
  • GitHub Check: typecheck
  • GitHub Check: smoke-macos-arm64
  • GitHub Check: analyze-js-ts
  • GitHub Check: e2e-artifacts
🧰 Additional context used
📓 Path-based instructions (2)
packages/opencode/**/*.ts

📄 CodeRabbit inference engine (packages/opencode/AGENTS.md)

packages/opencode/**/*.ts: Use Effect.gen(function* () { ... }) for Effect composition
Use Effect.fn("Domain.method") for named/traced effects and Effect.fnUntraced for internal helpers; these accept pipeable operators as extra arguments to avoid unnecessary outer .pipe() wrappers
Use Effect.callback for callback-based APIs
Prefer DateTime.nowAsDate over new Date(yield* Clock.currentTimeMillis) when you need a Date in Effect code
Use Schema.Class for multi-field data in Effect schemas
Use branded schemas (Schema.brand) for single-value types in Effect
Use Schema.TaggedErrorClass for typed errors in Effect schemas
Use Schema.Defect instead of unknown for defect-like causes in Effect code
In Effect.gen / Effect.fn, prefer yield* new MyError(...) over yield* Effect.fail(new MyError(...)) for direct early-failure branches
Use makeRuntime from src/effect/run-service.ts for all services; it returns { runPromise, runFork, runCallback } backed by a shared memoMap that deduplicates layers
Use InstanceState from src/effect/instance-state.ts for per-directory or per-project state that needs per-instance cleanup; do work directly in the InstanceState.make closure where ScopedCache handles run-once semantics
Use Effect.addFinalizer or Effect.acquireRelease inside the InstanceState.make closure for cleanup (subscriptions, process teardown, etc.)
Use Effect.forkScoped inside the InstanceState.make closure for background stream consumers — the fiber is interrupted when the instance is disposed
Prefer FileSystem.FileSystem instead of raw fs/promises for effectful file I/O in Effect services
Prefer ChildProcessSpawner.ChildProcessSpawner with ChildProcess.make(...) instead of custom process wrappers in Effect services
Prefer HttpClient.HttpClient instead of raw fetch in Effect services
Prefer Path.Path, Config, Clock, and DateTime services when those concerns are already inside Effect code
For backgroun...

Files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
  • packages/opencode/src/provider/error.ts
packages/opencode/test/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (packages/opencode/test/AGENTS.md)

packages/opencode/test/**/*.test.{ts,tsx}: Use the tmpdir function from fixture/fixture.ts to create temporary directories for tests with automatic cleanup. Use await using syntax to ensure automatic cleanup when the variable goes out of scope.
When using the tmpdir function with git repository support, pass the git: true option to initialize a git repo with a root commit.
Use the config option in tmpdir to write an opencode.json config file during test setup by passing a partial Config.Info object.
Use the init option in tmpdir to define custom setup functions that can return extra data accessible via tmp.extra, and use the dispose option for custom cleanup logic.
Use testEffect(...) from test/lib/effect.ts for tests that exercise Effect services or Effect-based workflows.
Use it.effect(...) when the test should run with TestClock and TestConsole. Use it.live(...) when the test depends on real time, filesystem mtimes, child processes, git, locks, or other live OS behavior.
Prefer Effect-aware helpers from fixture/fixture.ts over building manual runtimes in tests: use tmpdirScoped() for scoped temp directories, provideInstance(dir)(effect) for low-level binding without directory creation, provideTmpdirInstance(...) for single temp instance binding, or provideTmpdirServer(...) for tests that also need the test LLM server.
Define const it = testEffect(...) near the top of the test file and keep the test body inside Effect.gen(function* () { ... }). Yield services directly with yield* MyService.Service or yield* MyTool.
Avoid custom ManagedRuntime, attach(...), or ad hoc run(...) wrappers in Effect tests when testEffect(...) already provides the runtime.
When a test needs instance-local state, prefer provideTmpdirInstance(...) or provideInstance(...) over manual Instance.provide(...) inside Promise-style tests.

Files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
🧠 Learnings (29)
📓 Common learnings
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 302
File: packages/opencode/test/session/retry.test.ts:298-315
Timestamp: 2026-04-28T14:46:41.449Z
Learning: In Astro-Han/pawwork (PR `#302`, `packages/opencode/src/provider/error.ts` + `packages/opencode/test/session/retry.test.ts`), upstream anomalyco/opencode#24063 intentionally treats ALL OpenAI streaming `server_error` chunks as retryable (`isRetryable: true`). Do NOT request a test asserting `isRetryable: false` for a `server_error` stream chunk; the only negative regression needed is a non-`server_error` code (e.g. `bad_request`) to confirm the retry gate stays closed for other error codes.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/src/session/prompt.ts:108-169
Timestamp: 2026-04-27T10:33:12.228Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/session/prompt.ts` and `packages/opencode/src/session/processor.ts`, PR `#264`), the loop-gate race condition between `buildLoopContext()` and `recordSyntheticBlock`/`recordSyntheticStop` is intentionally handled via idempotence guards (re-check sigKey presence / `hasStopped` inside the record helpers) rather than a full per-parent `Effect.Mutex`. Threading a `Map<MessageID, Mutex>` through the processor was considered too large a surface change for this edge case; the residual TOCTOU window only produces extra synthetic parts with no behavioral drift on the "turn ends" contract. A code comment documents the trade-off and points to a full-mutex follow-up if the race is observed in practice. Do NOT re-flag the absence of a per-parent mutex as a blocking issue in future reviews.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/agent.ts:23-27
Timestamp: 2026-04-28T04:38:11.771Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/tool/agent.ts`), the `subagent_session_id` field in the `Parameters` schema accepts any `Schema.String` rather than a branded `SessionID`. This is inherited upstream behavior (adopted in PR `#270`, an upstream-sync graft of upstream PR `#23244`). The fix — validating `subagent_session_id` as a `SessionID` brand up front so malformed/typo'd IDs fail explicitly rather than silently forking a new subagent session — is intentionally deferred to a follow-up PR or upstream report. Do NOT re-flag this as a blocking issue in PR `#270` or in future upstream-sync PRs that carry the same schema; flag it only in a PawWork-authored PR that touches `agent.ts` parameter validation.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/patch/index.ts:337-346
Timestamp: 2026-04-28T04:38:05.946Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/patch/index.ts`), the BOM-transition surfacing gap — where `Bom.split` strips BOM before building `unified_diff`/`new_content` but `Bom.join` later re-attaches BOM on disk write, meaning BOM changes are not reflected in the diff payload — is intentionally deferred. PR `#270` is an upstream-sync graft; fixing the issue here would mix refactor + bugfix intents and drift the diff from the upstream baseline needed for clean future grafts. A dedicated follow-up PR (or upstream report) will address this. Do NOT re-flag the missing BOM-change surfacing in `ApplyPatchFileUpdate`/`ApplyPatchFileChange` as a blocking issue in PR `#270` or in future sync PRs that carry the same upstream baseline.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/file/ripgrep.ts:469-470
Timestamp: 2026-04-28T08:29:02.858Z
Learning: In `packages/opencode/src/file/ripgrep.ts` (Astro-Han/pawwork, PR `#270`), the `tree()` function filters files with `if (file.includes(".opencode")) continue`, which is a raw substring match that would also drop unrelated paths whose names merely contain `.opencode` (e.g., `.opencode-backup`). The correct fix is `if (file.split(path.sep).includes(".opencode")) continue` to match only the exact path segment. However, upstream `dev:packages/opencode/src/file/ripgrep.ts` carries the identical false-positive filter. PR `#270` is an upstream-sync graft; fixing it here would diverge from the baseline. The fix is deferred to a PawWork ripgrep filter cleanup PR or an upstream-first fix, to land alongside the other ripgrep deferrals (Schema.Class, bytes-union, abort short-circuit, mixed-separator). Do NOT re-flag the `.opencode` substring filter in upstream-sync PRs.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 224
File: packages/app/src/i18n/zh.ts:0-0
Timestamp: 2026-04-24T17:08:46.780Z
Learning: In Astro-Han/pawwork PR `#224`, the first-occurrence `PawWork 爪印` branding rule originally specified in issue `#196` was superseded by an updated Chinese-branding spec. On all zh UI surfaces in `packages/app/src/i18n/zh.ts` (e.g., `dialog.model.unpaid.freeModels.title`, `session.new.subtitle`, `sidebar.gettingStarted.line1`), the correct and intentional target is fully localized `爪印` branding — no `PawWork` prefix. Do NOT flag these strings as missing the first-occurrence `PawWork 爪印` rule in future reviews.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/tool.ts:34-42
Timestamp: 2026-04-28T04:56:21.528Z
Learning: In `packages/opencode/src/tool/tool.ts` (Astro-Han/pawwork, PR `#270`), the `Def.execute` signature uses `ctx: Context` (without the `M` generic) instead of `ctx: Context<M>`, meaning `ctx.metadata(...)` is not checked against the same `M` as `ExecuteResult<M>`. This is inherited upstream behavior adopted via the graft strategy in PR `#270` (an upstream-sync PR per `project_upstream_strategy.md`). Fixing it here would mix refactor + bugfix intents and drift the diff from the upstream baseline. The fix is deferred to a dedicated follow-up PR or upstream report. Do NOT re-flag the missing `Context<M>` generic in `Def.execute` during upstream-sync PRs; only flag it in a PawWork-authored PR that directly touches `tool.ts` type signatures.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/file/ripgrep.ts:45-72
Timestamp: 2026-04-28T07:27:49.810Z
Learning: In `packages/opencode/src/file/ripgrep.ts` (Astro-Han/pawwork), `PathText`, the `lines` field inside `SearchMatch`, and `submatches[].match` accept only `{text: string}` and do NOT handle ripgrep's `{bytes: base64}` JSON variant (emitted for non-UTF-8 paths/match text). This matches the upstream `dev:packages/opencode/src/file/ripgrep.ts` baseline exactly. The fix — adding `Schema.Union([Schema.Struct({text: Schema.String}), Schema.Struct({bytes: Schema.String})])` for each location plus base64-decode in downstream consumers — is intentionally deferred to either a dedicated PawWork PR or an upstream-first fix inherited via the next sync. Do NOT re-flag the missing `{bytes: ...}` union variants in upstream-sync PRs; flag it only in a PawWork-authored PR that directly touches these schema definitions or their downstream consumers.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/lsp.ts:23-32
Timestamp: 2026-04-28T07:28:14.317Z
Learning: In `packages/opencode/src/tool/lsp.ts` (Astro-Han/pawwork, PR `#270`), the `Parameters` schema requires `line` and `character` for all operations, including `workspaceSymbol` and `documentSymbol` which never use coordinates. This matches the upstream `dev:packages/opencode/src/tool/lsp.ts:23-32` baseline exactly — both fields are declared as required `Schema.Number` with `>= 1` checks. The fix (per-operation schema split, or making `line`/`character` optional with handler-side presence validation for operations that need them like `goToDefinition`/`findReferences`) is deferred to a follow-up PR or upstream report to avoid mixing refactor + bug-fix intents and drifting the diff from the upstream baseline. Do NOT re-flag the required coordinates on `workspaceSymbol`/`documentSymbol` in upstream-sync PRs; flag it only in a PawWork-authored PR that directly touches `lsp.ts` parameter validation.
📚 Learning: 2026-04-28T14:46:33.806Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 302
File: packages/opencode/src/plugin/codex.ts:0-0
Timestamp: 2026-04-28T14:46:33.806Z
Learning: In `packages/opencode/src/plugin/codex.ts` (Astro-Han/pawwork), the Provider SDK v1 hook `Model` type omits the `input` field on `limit`, but opencode uses `limit.input` for compaction budget. The pattern to avoid a bare `ts-expect-error` is to declare a local typed intersection `const limit: typeof model.limit & { input: number } = { context: ..., input: ..., output: ... }` and assign that to `model.limit`. Do not flag the intersection cast as unnecessary — it is the intentional way to type-safely add `input` without suppressing the whole assignment.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-21T16:00:44.910Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 102
File: packages/opencode/src/config/model-id.ts:9-11
Timestamp: 2026-04-21T16:00:44.910Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/config/model-id.ts`), `ConfigModelID` uses a regex that requires a non-empty provider segment and a non-empty model segment separated by `/`, but intentionally allows additional slashes within the model segment. This is because `Provider.parseModel` supports model IDs like `synthetic/hf:zai-org/GLM-4.7`. Do not flag the permissive slash handling as a bug.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-24T06:50:05.142Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 211
File: packages/opencode/src/provider/models.ts:113-179
Timestamp: 2026-04-24T06:50:05.142Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/provider/models.ts`), `PublishModel` and `PublishProvider` intentionally duplicate fields from `Model` and `Provider` with looser optionality. The split is a deliberate behavior boundary: `PublishModel`/`PublishProvider` (with `.passthrough()`) are used for lenient publish-time validation of incoming catalogs, while `Model`/`Provider` are used for strict runtime normalization. Do not flag this duplication as a maintenance burden or suggest extracting a shared base schema — the explicit separation is intentional and tied to the models-refresh reliability path introduced in PR `#211`.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-28T04:38:11.771Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/agent.ts:23-27
Timestamp: 2026-04-28T04:38:11.771Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/tool/agent.ts`), the `subagent_session_id` field in the `Parameters` schema accepts any `Schema.String` rather than a branded `SessionID`. This is inherited upstream behavior (adopted in PR `#270`, an upstream-sync graft of upstream PR `#23244`). The fix — validating `subagent_session_id` as a `SessionID` brand up front so malformed/typo'd IDs fail explicitly rather than silently forking a new subagent session — is intentionally deferred to a follow-up PR or upstream report. Do NOT re-flag this as a blocking issue in PR `#270` or in future upstream-sync PRs that carry the same schema; flag it only in a PawWork-authored PR that touches `agent.ts` parameter validation.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-28T08:29:02.858Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/file/ripgrep.ts:469-470
Timestamp: 2026-04-28T08:29:02.858Z
Learning: In `packages/opencode/src/file/ripgrep.ts` (Astro-Han/pawwork, PR `#270`), the `tree()` function filters files with `if (file.includes(".opencode")) continue`, which is a raw substring match that would also drop unrelated paths whose names merely contain `.opencode` (e.g., `.opencode-backup`). The correct fix is `if (file.split(path.sep).includes(".opencode")) continue` to match only the exact path segment. However, upstream `dev:packages/opencode/src/file/ripgrep.ts` carries the identical false-positive filter. PR `#270` is an upstream-sync graft; fixing it here would diverge from the baseline. The fix is deferred to a PawWork ripgrep filter cleanup PR or an upstream-first fix, to land alongside the other ripgrep deferrals (Schema.Class, bytes-union, abort short-circuit, mixed-separator). Do NOT re-flag the `.opencode` substring filter in upstream-sync PRs.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-28T04:56:13.350Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/tool.ts:8-10
Timestamp: 2026-04-28T04:56:13.350Z
Learning: In `packages/opencode/src/tool/tool.ts` (Astro-Han/pawwork, PR `#270`), the `Metadata` interface uses `[key: string]: any` as its index signature. This is upstream-inherited code adopted wholesale via the graft strategy (per `project_upstream_strategy.md`). The fix — tightening to `Record<string, unknown>` and explicitly narrowing framework-owned fields like `truncated` — is intentionally deferred to a follow-up PR or upstream report. Do NOT re-flag the `any` index signature on `Metadata` in PR `#270` or in future upstream-sync PRs that carry the same upstream baseline.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-28T07:27:49.810Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/file/ripgrep.ts:45-72
Timestamp: 2026-04-28T07:27:49.810Z
Learning: In `packages/opencode/src/file/ripgrep.ts` (Astro-Han/pawwork), `PathText`, the `lines` field inside `SearchMatch`, and `submatches[].match` accept only `{text: string}` and do NOT handle ripgrep's `{bytes: base64}` JSON variant (emitted for non-UTF-8 paths/match text). This matches the upstream `dev:packages/opencode/src/file/ripgrep.ts` baseline exactly. The fix — adding `Schema.Union([Schema.Struct({text: Schema.String}), Schema.Struct({bytes: Schema.String})])` for each location plus base64-decode in downstream consumers — is intentionally deferred to either a dedicated PawWork PR or an upstream-first fix inherited via the next sync. Do NOT re-flag the missing `{bytes: ...}` union variants in upstream-sync PRs; flag it only in a PawWork-authored PR that directly touches these schema definitions or their downstream consumers.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-28T14:46:41.449Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 302
File: packages/opencode/test/session/retry.test.ts:298-315
Timestamp: 2026-04-28T14:46:41.449Z
Learning: In Astro-Han/pawwork (PR `#302`, `packages/opencode/src/provider/error.ts` + `packages/opencode/test/session/retry.test.ts`), upstream anomalyco/opencode#24063 intentionally treats ALL OpenAI streaming `server_error` chunks as retryable (`isRetryable: true`). Do NOT request a test asserting `isRetryable: false` for a `server_error` stream chunk; the only negative regression needed is a non-`server_error` code (e.g. `bad_request`) to confirm the retry gate stays closed for other error codes.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-28T04:56:21.528Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/tool.ts:34-42
Timestamp: 2026-04-28T04:56:21.528Z
Learning: In `packages/opencode/src/tool/tool.ts` (Astro-Han/pawwork, PR `#270`), the `Def.execute` signature uses `ctx: Context` (without the `M` generic) instead of `ctx: Context<M>`, meaning `ctx.metadata(...)` is not checked against the same `M` as `ExecuteResult<M>`. This is inherited upstream behavior adopted via the graft strategy in PR `#270` (an upstream-sync PR per `project_upstream_strategy.md`). Fixing it here would mix refactor + bugfix intents and drift the diff from the upstream baseline. The fix is deferred to a dedicated follow-up PR or upstream report. Do NOT re-flag the missing `Context<M>` generic in `Def.execute` during upstream-sync PRs; only flag it in a PawWork-authored PR that directly touches `tool.ts` type signatures.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-28T04:37:56.001Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/webfetch.ts:12-20
Timestamp: 2026-04-28T04:37:56.001Z
Learning: In `packages/opencode/src/tool/webfetch.ts` (and analogously `packages/opencode/src/tool/read.ts`), numeric parameter bounds (e.g., `timeout > 0 && timeout <= 120`) are intentionally enforced via runtime guards inside `execute()` rather than `Schema.Number.pipe(Schema.greaterThan(0), Schema.lessThanOrEqualTo(120))` refinements. The reason is to preserve a simple `"type": "number"` JSON Schema shape advertised to model callers — schema refinements would add `minimum`/`maximum` constraints that could over-constrain LLM-generated tool calls. Do NOT suggest moving these checks into the `Parameters` schema.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-27T08:58:00.665Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/src/session/prompt.ts:531-538
Timestamp: 2026-04-27T08:58:00.665Z
Learning: When using Effect (e.g., `yield*` with `Effect`-style generator yielding), only use `yield* new SomeErrorClass(...)` if `SomeErrorClass` extends `Schema.TaggedErrorClass` (i.e., it implements Effect’s Yieldable interface). For plain `Error` subclasses (like `BlockedLoopError` / `LoopStopError`) or inline `new Error(...)` values, they are not yieldable and must be wrapped as `yield* Effect.fail(new PlainError(...))`. Do not recommend changing `yield* Effect.fail(new SomePlainError(...))` to `yield* new SomePlainError(...)` unless the error class extends `Schema.TaggedErrorClass`.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-28T12:49:09.792Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 287
File: packages/opencode/src/session/subagent-run.ts:445-445
Timestamp: 2026-04-28T12:49:09.792Z
Learning: In this repo’s `packages/opencode/src` module files, the canonical module-as-namespace pattern is load-bearing: at the bottom of each module file, add a self-reexport in the form `export * as <ModuleName> from "./<module-file>"` (e.g., in `subagent-run.ts`: `export * as SubagentRun from "./subagent-run"`). Consumers must import the namespace via a named binding (e.g., `import { SubagentRun } from ".../subagent-run"`) and then access members as `SubagentRun.Service`, `SubagentRun.defaultLayer`, `SubagentRun.AgentListFilter`, etc. During code review, do NOT flag these self-reexports as dead code or redundant; removing or altering them will break the consumer named binding.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-27T12:59:45.694Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/test/session/prompt-effect.test.ts:0-0
Timestamp: 2026-04-27T12:59:45.694Z
Learning: In session prompt/diagnostics tests (e.g., when verifying injected LLM “recovery reminder” copy), do not assert a single exact reminder phrasing tied to signature kind. If a scenario can produce both `input:` and `target:` signatures (for example, a `read` tool call with a `filePath` parameter), accept either phrasing: the input-repeat variant may say “repeated the same tool input 3 times” (literal count), and the target-repeat variant may say “failed against the same target multiple times” (“multiple times” without a count). Your assertions should allow both variants rather than narrowing to only the input-variant text.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-28T11:24:35.312Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 287
File: packages/opencode/test/session/subagent-lifecycle-integration.test.ts:41-47
Timestamp: 2026-04-28T11:24:35.312Z
Learning: In `packages/opencode/test/session/subagent-lifecycle-integration.test.ts` (Astro-Han/pawwork, PR `#287`), the test suite intentionally uses a manual `run` helper (`Effect.runPromise(program.pipe(Effect.provide(Layer.mergeAll(SubagentRun.defaultLayer, Session.defaultLayer)), Effect.orDie))`) and raw `bun:test` test cases rather than the `testEffect(...)` harness. All 27 lifecycle tests pass with this pattern. Migration to `testEffect` + `it.live(...)` is deferred to a dedicated test-harness sweep PR to avoid fixture drift before merge. Do NOT re-flag the manual `Effect.runPromise` runner in this file as needing harness migration until that sweep PR lands.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-28T04:56:21.338Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/test/tool/tool-define.test.ts:2-9
Timestamp: 2026-04-28T04:56:21.338Z
Learning: In `packages/opencode/test/tool/tool-define.test.ts` (Astro-Han/pawwork, PR `#270`), the test suite intentionally uses a `ManagedRuntime.make(Layer.mergeAll(Truncate.defaultLayer, Agent.defaultLayer))` wrapper and raw `bun:test` `test(...)` cases rather than the `testEffect(...)` harness. This code was adopted wholesale from upstream via the graft strategy (per `project_upstream_strategy.md`). Migrating to `testEffect` is deferred to a follow-up PR so as not to mix refactor + harness-migration intents and to avoid drifting the diff from the upstream baseline needed for clean future grafts. Do NOT re-flag the `ManagedRuntime.make` usage in this file as needing harness migration until that follow-up lands.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-28T05:36:24.561Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/test/file/ripgrep.test.ts:0-0
Timestamp: 2026-04-28T05:36:24.561Z
Learning: In `packages/opencode/test/file/ripgrep.test.ts` (Astro-Han/pawwork, PR `#270`), after the streaming-Service refactor of `ripgrep.ts`, there is only one code path that strips `RIPGREP_CONFIG_PATH` from the spawned environment (`delete env.RIPGREP_CONFIG_PATH` at `src/file/ripgrep.ts` line ~155). There is no longer a separate "worker" vs "direct" execution mode, so a single test titled "ignores RIPGREP_CONFIG_PATH from spawned env" (with an explanatory comment) is the correct and complete coverage. Do NOT flag the absence of a separate worker-mode RIPGREP_CONFIG_PATH test as a gap.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-27T11:18:47.332Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 271
File: packages/opencode/test/tool/mcp-exa.test.ts:1-186
Timestamp: 2026-04-27T11:18:47.332Z
Learning: In `packages/opencode/test/tool/mcp-exa.test.ts` (Astro-Han/pawwork), the tests intentionally use raw `bun:test` async cases with `Effect.runPromise(...)` and per-case `HttpClient.make(...)` fakes rather than the `testEffect(...)` harness. The maintainer has explicitly decided not to migrate, because the HttpClient fake wiring is itself the behavior under test and switching to `testEffect` would be style churn without changing risk coverage. Do not flag these tests as needing harness migration.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-27T11:19:24.963Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 271
File: packages/opencode/test/tool/websearch-auth.test.ts:0-0
Timestamp: 2026-04-27T11:19:24.963Z
Learning: In `packages/opencode/test/tool/websearch-auth.test.ts` (Astro-Han/pawwork), the tests intentionally use a small local `runWith` runner with raw `bun:test` and `Effect.runPromise` rather than the `testEffect` harness. Each test case injects a custom in-memory `Auth.Service` layer; switching to `testEffect` would be style-only churn without changing risk coverage. Do not flag these tests as needing harness migration.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-25T12:52:35.631Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 234
File: packages/desktop-electron/src/main/ipc.ts:238-263
Timestamp: 2026-04-25T12:52:35.631Z
Learning: In Astro-Han/pawwork (`packages/desktop-electron/src/main/ipc.ts`), `deps.getServerReadyData()` (backed by `serverReady.promise` in `index.ts`) resolves once at server startup and remains settled; it is not expected to reject in practice. Do not flag the absence of a try-catch around it in the `export-session` IPC handler — the network/fetch layer in `server-client.ts` already has a 10-second AbortController timeout and returns a typed `{ok: false, error}` payload, covering the real failure modes.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-28T05:36:25.456Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/test/tool/grep.test.ts:16-22
Timestamp: 2026-04-28T05:36:25.456Z
Learning: In `packages/opencode/test/tool/grep.test.ts` (Astro-Han/pawwork, PR `#270`), the test suite intentionally uses a manual `ManagedRuntime.make(Layer.mergeAll(...))` setup and raw `bun:test` test cases rather than the `testEffect(...)` harness. The migration to `testEffect` + `it.live(...)` + `provideTmpdirInstance(...)` is a whole-file rewrite deferred to a dedicated sweep PR that will migrate all tool tests still on the manual ManagedRuntime pattern together. Do NOT re-flag the ManagedRuntime setup in this file as needing harness migration until that sweep PR lands.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-28T04:38:21.935Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/test/file/ripgrep.test.ts:9-10
Timestamp: 2026-04-28T04:38:21.935Z
Learning: In `packages/opencode/test/file/ripgrep.test.ts` (Astro-Han/pawwork, PR `#270`), the tests intentionally use a local `run` helper (`effect.pipe(Effect.provide(Ripgrep.defaultLayer), Effect.runPromise)`) rather than the `testEffect(...)` harness. This file was adopted from upstream wholesale as part of an upstream-sync graft; migrating to `testEffect` + `it.live(...)` was deferred to a separate follow-up PR to avoid mixing refactor and harness-migration intents within the sync. Do NOT re-flag the local `run` wrapper as needing harness migration until that follow-up lands.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-27T10:33:12.228Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/src/session/prompt.ts:108-169
Timestamp: 2026-04-27T10:33:12.228Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/session/prompt.ts` and `packages/opencode/src/session/processor.ts`, PR `#264`), the loop-gate race condition between `buildLoopContext()` and `recordSyntheticBlock`/`recordSyntheticStop` is intentionally handled via idempotence guards (re-check sigKey presence / `hasStopped` inside the record helpers) rather than a full per-parent `Effect.Mutex`. Threading a `Map<MessageID, Mutex>` through the processor was considered too large a surface change for this edge case; the residual TOCTOU window only produces extra synthetic parts with no behavioral drift on the "turn ends" contract. A code comment documents the trade-off and points to a full-mutex follow-up if the race is observed in practice. Do NOT re-flag the absence of a per-parent mutex as a blocking issue in future reviews.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-21T12:14:30.524Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 93
File: packages/opencode/test/server/cors-middleware.test.ts:15-23
Timestamp: 2026-04-21T12:14:30.524Z
Learning: In `packages/opencode/test/server/cors-middleware.test.ts` (and similar Hono-based test files), `app.request(...)` on the Hono `Hono` app type returns `Promise<Response> | Response` — a union that includes a non-PromiseLike branch. `Effect.promise(...)` requires a `PromiseLike`, so wrapping with `Promise.resolve(app.request(...))` is necessary to normalize the union type. Removing the wrapper causes a TS2322 typecheck error in `packages/opencode`.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-23T08:51:00.819Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 186
File: packages/opencode/test/plugin/workspace-adaptor.test.ts:139-144
Timestamp: 2026-04-23T08:51:00.819Z
Learning: For pawwork tests under packages/opencode/test/**, auth.json teardown may intentionally combine `Filesystem.write` (from `packages/opencode/src/util/filesystem.ts`) with `node:fs/promises` `unlink` for cleanup. Do not flag this as inconsistent style; it is the established/intentional pattern because `Filesystem` does not provide a `remove`/`unlink` helper.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-27T11:18:47.298Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 271
File: packages/opencode/test/tool/websearch.test.ts:21-78
Timestamp: 2026-04-27T11:18:47.298Z
Learning: In `packages/opencode/test/tool/websearch.test.ts`, the tests intentionally use manual `Effect.runPromise` with explicit `Effect.provide(...)` chains (including `Layer.succeed(Auth.Service, ...)`, `Layer.succeed(HttpClient.HttpClient, http)`, `WebSearchAuth.layer`, `Truncate.defaultLayer`, and `Agent.defaultLayer`) rather than the `testEffect(...)` harness. This is by design: the fake Auth and HTTP recovery-metadata layers must be explicitly injected and kept visible/scoped at the test site. Do NOT suggest migrating these tests to `testEffect` or removing the manual layer provides.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-20T14:36:31.032Z
Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Define `const it = testEffect(...)` near the top of the test file and keep the test body inside `Effect.gen(function* () { ... })`. Yield services directly with `yield* MyService.Service` or `yield* MyTool`.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-20T14:36:21.288Z
Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Use `Schema.TaggedErrorClass` for typed errors in Effect schemas

Applied to files:

  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-28T11:24:23.345Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 287
File: packages/opencode/src/session/subagent-run.ts:8-16
Timestamp: 2026-04-28T11:24:23.345Z
Learning: In Astro-Han/pawwork, the established pattern for service-internal errors (e.g., in `packages/opencode/src/session/subagent-run.ts`) is plain classes (with a `readonly _tag` discriminant) or `NamedError.create` (zod-based), NOT `Schema.TaggedErrorClass`. Converting individual modules to `Schema.TaggedErrorClass` one-off would diverge from the surrounding code. The migration to `Schema.TaggedErrorClass` for service-internal errors is intentionally deferred to a dedicated repo-wide error-class sweep PR. Do NOT flag plain error classes in `subagent-run.ts` or other service-internal modules as needing `Schema.TaggedErrorClass` conversion until that sweep PR lands.

Applied to files:

  • packages/opencode/src/provider/error.ts
🔇 Additional comments (11)
packages/opencode/src/provider/error.ts (3)

105-107: Good runtime payload guard.

The isRecord type guard is a solid defensive check before field access and avoids array/object ambiguity in stream payload parsing.


118-118: Type widening to boolean is correct here.

This matches the new parsed stream variants where retryability can be true (server_error) or false (other API errors).


123-169: Wrapper unwrapping + server_error mapping looks correct.

The inner-message JSON unwrap path and fallback message behavior are both implemented cleanly, and the intentional retryability for OpenAI stream server_error is consistent with the upstream backport target.

Based on learnings: In Astro-Han/pawwork PR #302 (packages/opencode/src/provider/error.ts + packages/opencode/test/session/retry.test.ts), upstream intentionally treats all OpenAI streaming server_error chunks as retryable.

packages/opencode/test/session/retry.test.ts (3)

298-315: Strong positive-path regression test.

This clearly verifies stream server_error conversion into retryable MessageV2.APIError and the downstream retry message path.


317-334: Nice fallback coverage for missing error.message.

The explicit assertion for "Server error." locks in the fallback behavior and prevents regressions in parser defaults.


336-352: Good negative gate test using non-server_error code.

This is the right regression check to ensure only intended stream codes become retryable API errors.

Based on learnings: In Astro-Han/pawwork PR #302, the intended negative regression is a non-server_error code (e.g., bad_request) rather than asserting non-retryability for server_error.

packages/opencode/src/plugin/codex.ts (2)

111-113: Good discriminator for the gpt-5.5 family.

This helper is tight and avoids accidental matches like gpt-5.50 while still allowing intended gpt-5.5-* variants.


457-464: Limit override implementation is correct and type-safe.

You’re replacing model.limit atomically with the intended Codex OAuth caps, and the local intersection type cleanly models input without blanket suppression.

Based on learnings: “the pattern to avoid a bare ts-expect-error is to declare a local typed intersection const limit: typeof model.limit & { input: number } = ... and assign that to model.limit.”

packages/opencode/test/plugin/codex.test.ts (3)

3-9: Import updates are consistent with the new test surface.

The added symbol aligns with the newly introduced matcher test block.


148-160: Matcher tests are well-scoped and cover both sides.

Positive and negative cases here are strong and directly guard the regex contract.


162-221: Great regression coverage for OAuth model mutation.

This test correctly validates both pre-override state and post-loader replacement of limits, plus cost zeroing.


📝 Walkthrough

Walkthrough

Adds gpt-5.5-specific token limit assignment during Codex OAuth model setup, enhances streaming error parsing to handle nested/error-wrapped payloads and recognize server_error as retryable, and adds tests covering both Codex OAuth behavior and stream-error retry handling.

Changes

Cohort / File(s) Summary
Codex Model Configuration
packages/opencode/src/plugin/codex.ts
Detects gpt-5.5 API/model IDs, zeroes Codex-related cost fields, and sets explicit limit with context, input, and output token values; exports hasCodexOAuthGpt55Limit.
Error Handling
packages/opencode/src/provider/error.ts
parseStreamError now validates parsed payloads, unwraps Error-like raw.message JSON strings, dispatches on a safely extracted body.error record, treats server_error as a retryable api_error, and makes api_error.isRetryable boolean.
Tests
packages/opencode/test/plugin/codex.test.ts, packages/opencode/test/session/retry.test.ts
Adds unit tests for hasCodexOAuthGpt55Limit and Codex OAuth model limit mutation; adds tests verifying MessageV2.fromError/session retry behavior for server_error streaming chunks and fallback message handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I sniffed the stream and found a clue,
gpt-5.5 limits set just true,
Server errors now jump back in play,
Retries hop forward, clearing the way.
Codex carrots glowing, bright and new!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(openai): sync OAuth context and retry fixes' accurately summarizes the main changes: backporting OAuth context limit fixes for gpt-5.5 and stream error retry handling improvements.
Description check ✅ Passed The description comprehensively covers all required template sections: Summary (backports two upstream fixes), Why (explains the OAuth context metadata issue), Related Issue (#298), How To Verify (with specific test commands), Screenshots/Recordings (appropriately marked as none), and a complete Checklist with all items marked.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/openai-oauth-context

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces specific limit overrides for GPT-5.5 models within the Codex OAuth plugin and improves error handling for server-side errors in the stream parser. The review feedback suggests using a more robust type guard for error objects and recommends using the canonical model API ID for configuration checks to ensure consistency.

Comment thread packages/opencode/src/provider/error.ts Outdated
Comment thread packages/opencode/src/plugin/codex.ts Outdated
@Astro-Han

Copy link
Copy Markdown
Owner Author

Review: fix(openai): sync OAuth context and retry fixes

Overall the two backports are well-scoped and the test coverage is good. A few concerns from the PawWork side, ordered by severity.

P0 — Potential regression: server_error blanket retryability

error.ts now marks all server_error stream chunks as isRetryable: true. Upstream OpenAI sends server_error for a wide range of conditions — some are transient (overloaded), but others are deterministic (bad request shape, content policy violation surfaced late in the stream). Retrying a deterministic error wastes tokens and user time.

Ask: Can you confirm this matches upstream behavior exactly, or should we narrow to a subset of server_error sub-codes / messages? If upstream already does blanket retry, we should at least cap retries with the existing RETRY_MAX_DELAY so it doesn't spin forever on deterministic failures.

P1 — Missing negative test for non-retryable server_error

The retry test only asserts the happy path (isRetryable: true). There is no test for a server_error that should not be retried, or for the fallback message branch when body?.error?.message is missing.

Ask: Add a test for a server_error chunk without a message field to verify the "Server error." fallback, and consider adding a test for a non-retryable stream error to ensure the switch default still returns undefined.

P1 — limit.input is not typed in the model schema

The @ts-expect-error on input: 272_000 is a smell. The Model schema in models.ts types limit.input as z.number().optional(), so the field is typed. The comment says "Provider SDK v1 model limits do not type input", but this is opencode's own schema. If the type error is real, it suggests a mismatch between the runtime object shape and the Model type at this call site.

Ask: Clarify why ts-expect-error is needed here. Is provider.models typed as Record<string, Model>? If so, input should be accepted. If not, that's a type hole we should fix rather than suppress.

P2 — model.id.includes("gpt-5.5") is fragile for future variants

This will match gpt-5.5, gpt-5.5-mini, gpt-5.5-codex, etc. — which may be correct, but it's implicit. shouldKeepCodexOAuthModel uses a regex with semantic version comparison (major > 5 || (major === 5 && minor > 4)), which is more robust. The limit override uses string inclusion, which could accidentally match a future model that has different limits.

Ask: Consider aligning the matching logic with shouldKeepCodexOAuthModel (e.g., reuse the version parser) or document why includes is sufficient here.

P2 — Test does not verify cache limit reset

The test asserts cost is zeroed out, but the limit override happens after the cost zeroing loop. The test does not verify that the original limit values (e.g., context: 1_050_000) are actually overwritten rather than merged.

Ask: Add an assertion that provider.models["gpt-5.5"].limit.context is 400_000 before the loader call to prove the override is replacing, not merging.

P3 — responseBody double-stringification concern

In parseStreamError, when the input is a double-wrapped JSON string ({ message: JSON.stringify({...}) }), the code does:

  1. json(input) → parses the outer wrapper
  2. json(raw.message) → parses the inner payload
  3. JSON.stringify(body) → stringifies back for responseBody

This means responseBody will be the inner JSON stringified, losing the outer wrapper. If the outer wrapper contained useful metadata (e.g., a wrapper type field), it's gone. This is probably fine for the OpenAI case, but it's a subtle behavior change.

Ask: Document in a code comment why the outer wrapper is intentionally discarded for responseBody.

P3 — Nit: commit message scope consistency

Commit 1 is fix(openai): ... and commit 2 is fix(provider): .... Since both touch the OpenAI provider path, consider whether fix(provider) is the right scope, or if fix(openai) would be more consistent. Not blocking.

@Astro-Han

Copy link
Copy Markdown
Owner Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Full review triggered.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Inline comment on packages/opencode/src/provider/error.ts:156 (commit 6a45ea1d)

P0 — Potential regression: Marking all server_error chunks as isRetryable: true is risky. OpenAI sends server_error for both transient (overloaded) and deterministic (bad request shape, content policy violation surfaced late in the stream) failures. Retrying deterministic errors wastes tokens and user time. Please confirm this matches upstream behavior exactly, or narrow to a subset of sub-codes/messages. If upstream already does blanket retry, the existing RETRY_MAX_DELAY cap should prevent infinite spin, but it's worth verifying.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Inline comment on packages/opencode/src/plugin/codex.ts:457 (commit 51c5f614)

P1 — Type smell: The @ts-expect-error on input: 272_000 is suspicious. The Model schema in models.ts already types limit.input as z.number().optional(), so the field is typed. The comment says "Provider SDK v1 model limits do not type input", but this is opencode's own schema. If the type error is real, it suggests provider.models is not typed as Model at this call site — that's a type hole we should fix rather than suppress. Can you clarify why this is needed?

@Astro-Han

Copy link
Copy Markdown
Owner Author

Inline comment on packages/opencode/src/plugin/codex.ts:455 (commit 51c5f614)

P2 — Fragile matching: model.id.includes("gpt-5.5") will match gpt-5.5, gpt-5.5-mini, gpt-5.5-codex, etc. This may be correct, but it's implicit. shouldKeepCodexOAuthModel uses semantic version comparison (major > 5 || (major === 5 && minor > 4)), which is more robust. The limit override uses string inclusion, which could accidentally match a future model with different limits. Consider aligning the matching logic or documenting why includes is sufficient here.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Inline comment on packages/opencode/test/session/retry.test.ts:314 (commit 6a45ea1d)

P1 — Missing negative test: The test only asserts the happy path (isRetryable: true). Please add a test for a server_error chunk without a message field to verify the "Server error." fallback, and consider adding a test for a non-retryable stream error to ensure the switch default still returns undefined.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Inline comment on packages/opencode/test/plugin/codex.test.ts:188 (commit 51c5f614)

P2 — Missing assertion: The test asserts cost is zeroed but does not verify that the original limit values are actually overwritten rather than merged. Consider adding an assertion that provider.models["gpt-5.5"].limit.context is 400_000 before the loader call to prove the override is replacing, not merging.

@Astro-Han

Copy link
Copy Markdown
Owner Author

Inline comment on packages/opencode/src/provider/error.ts:121 (commit 6a45ea1d)

P3 — Double-stringification concern: When input is a double-wrapped JSON string ({ message: JSON.stringify({...}) }), the code does: (1) json(input) parses the outer wrapper, (2) json(raw.message) parses the inner payload, (3) JSON.stringify(body) stringifies back for responseBody. This means responseBody will be the inner payload re-stringified, losing the outer wrapper. If the outer wrapper contained useful metadata, it's gone. This is probably fine for the OpenAI case, but please add a code comment explaining why the outer wrapper is intentionally discarded for responseBody.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/opencode/src/plugin/codex.ts`:
- Around line 453-460: Replace the brittle check model.id.includes("gpt-5.5")
with the same semver-based discriminator used by shouldKeepCodexOAuthModel:
inspect model.api.id and run the same regex match (/^gpt-(\d+)\.(\d+)/) to
detect a 5.5 series model before applying the smaller limits in the block that
sets model.limit; keep the existing `@ts-expect-error` on limit.input since the
upstream Provider SDK v1 lacks that type, but change the ID/source used to
model.api.id and the semver match to ensure consistency with
shouldKeepCodexOAuthModel.

In `@packages/opencode/test/session/retry.test.ts`:
- Around line 298-315: Add two negative/regression cases to retry.test.ts using
MessageV2.fromError with ProviderID.make("openai"): (1) create a server_error
chunk where the error object lacks message (e.g., error: { code: "server_error"
}) and assert MessageV2.APIError.isInstance(result) is true, (result as
MessageV2.APIError).data.isRetryable is true, and SessionRetry.retryable(result)
equals the fallback "Server error."; (2) create a server_error chunk that
explicitly marks non-retryable (e.g., error: { code: "server_error", message:
"X", retryable: false }) and assert MessageV2.APIError.isInstance(result) is
true, (result as MessageV2.APIError).data.isRetryable is false, and
SessionRetry.retryable(result) returns null/does not indicate a retryable
message.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: a905950c-3f84-4f7d-b92e-b93c86b1f7da

📥 Commits

Reviewing files that changed from the base of the PR and between 6461f6c and 6a45ea1.

📒 Files selected for processing (4)
  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
  • packages/opencode/test/plugin/codex.test.ts
  • packages/opencode/test/session/retry.test.ts
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
packages/opencode/**/*.ts

📄 CodeRabbit inference engine (packages/opencode/AGENTS.md)

packages/opencode/**/*.ts: Use Effect.gen(function* () { ... }) for Effect composition
Use Effect.fn("Domain.method") for named/traced effects and Effect.fnUntraced for internal helpers; these accept pipeable operators as extra arguments to avoid unnecessary outer .pipe() wrappers
Use Effect.callback for callback-based APIs
Prefer DateTime.nowAsDate over new Date(yield* Clock.currentTimeMillis) when you need a Date in Effect code
Use Schema.Class for multi-field data in Effect schemas
Use branded schemas (Schema.brand) for single-value types in Effect
Use Schema.TaggedErrorClass for typed errors in Effect schemas
Use Schema.Defect instead of unknown for defect-like causes in Effect code
In Effect.gen / Effect.fn, prefer yield* new MyError(...) over yield* Effect.fail(new MyError(...)) for direct early-failure branches
Use makeRuntime from src/effect/run-service.ts for all services; it returns { runPromise, runFork, runCallback } backed by a shared memoMap that deduplicates layers
Use InstanceState from src/effect/instance-state.ts for per-directory or per-project state that needs per-instance cleanup; do work directly in the InstanceState.make closure where ScopedCache handles run-once semantics
Use Effect.addFinalizer or Effect.acquireRelease inside the InstanceState.make closure for cleanup (subscriptions, process teardown, etc.)
Use Effect.forkScoped inside the InstanceState.make closure for background stream consumers — the fiber is interrupted when the instance is disposed
Prefer FileSystem.FileSystem instead of raw fs/promises for effectful file I/O in Effect services
Prefer ChildProcessSpawner.ChildProcessSpawner with ChildProcess.make(...) instead of custom process wrappers in Effect services
Prefer HttpClient.HttpClient instead of raw fetch in Effect services
Prefer Path.Path, Config, Clock, and DateTime services when those concerns are already inside Effect code
For backgroun...

Files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/test/plugin/codex.test.ts
  • packages/opencode/src/provider/error.ts
packages/opencode/test/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (packages/opencode/test/AGENTS.md)

packages/opencode/test/**/*.test.{ts,tsx}: Use the tmpdir function from fixture/fixture.ts to create temporary directories for tests with automatic cleanup. Use await using syntax to ensure automatic cleanup when the variable goes out of scope.
When using the tmpdir function with git repository support, pass the git: true option to initialize a git repo with a root commit.
Use the config option in tmpdir to write an opencode.json config file during test setup by passing a partial Config.Info object.
Use the init option in tmpdir to define custom setup functions that can return extra data accessible via tmp.extra, and use the dispose option for custom cleanup logic.
Use testEffect(...) from test/lib/effect.ts for tests that exercise Effect services or Effect-based workflows.
Use it.effect(...) when the test should run with TestClock and TestConsole. Use it.live(...) when the test depends on real time, filesystem mtimes, child processes, git, locks, or other live OS behavior.
Prefer Effect-aware helpers from fixture/fixture.ts over building manual runtimes in tests: use tmpdirScoped() for scoped temp directories, provideInstance(dir)(effect) for low-level binding without directory creation, provideTmpdirInstance(...) for single temp instance binding, or provideTmpdirServer(...) for tests that also need the test LLM server.
Define const it = testEffect(...) near the top of the test file and keep the test body inside Effect.gen(function* () { ... }). Yield services directly with yield* MyService.Service or yield* MyTool.
Avoid custom ManagedRuntime, attach(...), or ad hoc run(...) wrappers in Effect tests when testEffect(...) already provides the runtime.
When a test needs instance-local state, prefer provideTmpdirInstance(...) or provideInstance(...) over manual Instance.provide(...) inside Promise-style tests.

Files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
🧠 Learnings (13)
📓 Common learnings
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/src/session/prompt.ts:108-169
Timestamp: 2026-04-27T10:33:12.228Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/session/prompt.ts` and `packages/opencode/src/session/processor.ts`, PR `#264`), the loop-gate race condition between `buildLoopContext()` and `recordSyntheticBlock`/`recordSyntheticStop` is intentionally handled via idempotence guards (re-check sigKey presence / `hasStopped` inside the record helpers) rather than a full per-parent `Effect.Mutex`. Threading a `Map<MessageID, Mutex>` through the processor was considered too large a surface change for this edge case; the residual TOCTOU window only produces extra synthetic parts with no behavioral drift on the "turn ends" contract. A code comment documents the trade-off and points to a full-mutex follow-up if the race is observed in practice. Do NOT re-flag the absence of a per-parent mutex as a blocking issue in future reviews.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/agent.ts:23-27
Timestamp: 2026-04-28T04:38:11.771Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/tool/agent.ts`), the `subagent_session_id` field in the `Parameters` schema accepts any `Schema.String` rather than a branded `SessionID`. This is inherited upstream behavior (adopted in PR `#270`, an upstream-sync graft of upstream PR `#23244`). The fix — validating `subagent_session_id` as a `SessionID` brand up front so malformed/typo'd IDs fail explicitly rather than silently forking a new subagent session — is intentionally deferred to a follow-up PR or upstream report. Do NOT re-flag this as a blocking issue in PR `#270` or in future upstream-sync PRs that carry the same schema; flag it only in a PawWork-authored PR that touches `agent.ts` parameter validation.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/patch/index.ts:337-346
Timestamp: 2026-04-28T04:38:05.946Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/patch/index.ts`), the BOM-transition surfacing gap — where `Bom.split` strips BOM before building `unified_diff`/`new_content` but `Bom.join` later re-attaches BOM on disk write, meaning BOM changes are not reflected in the diff payload — is intentionally deferred. PR `#270` is an upstream-sync graft; fixing the issue here would mix refactor + bugfix intents and drift the diff from the upstream baseline needed for clean future grafts. A dedicated follow-up PR (or upstream report) will address this. Do NOT re-flag the missing BOM-change surfacing in `ApplyPatchFileUpdate`/`ApplyPatchFileChange` as a blocking issue in PR `#270` or in future sync PRs that carry the same upstream baseline.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/tool/tool.ts:34-42
Timestamp: 2026-04-28T04:56:21.528Z
Learning: In `packages/opencode/src/tool/tool.ts` (Astro-Han/pawwork, PR `#270`), the `Def.execute` signature uses `ctx: Context` (without the `M` generic) instead of `ctx: Context<M>`, meaning `ctx.metadata(...)` is not checked against the same `M` as `ExecuteResult<M>`. This is inherited upstream behavior adopted via the graft strategy in PR `#270` (an upstream-sync PR per `project_upstream_strategy.md`). Fixing it here would mix refactor + bugfix intents and drift the diff from the upstream baseline. The fix is deferred to a dedicated follow-up PR or upstream report. Do NOT re-flag the missing `Context<M>` generic in `Def.execute` during upstream-sync PRs; only flag it in a PawWork-authored PR that directly touches `tool.ts` type signatures.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/file/ripgrep.ts:469-470
Timestamp: 2026-04-28T08:29:02.858Z
Learning: In `packages/opencode/src/file/ripgrep.ts` (Astro-Han/pawwork, PR `#270`), the `tree()` function filters files with `if (file.includes(".opencode")) continue`, which is a raw substring match that would also drop unrelated paths whose names merely contain `.opencode` (e.g., `.opencode-backup`). The correct fix is `if (file.split(path.sep).includes(".opencode")) continue` to match only the exact path segment. However, upstream `dev:packages/opencode/src/file/ripgrep.ts` carries the identical false-positive filter. PR `#270` is an upstream-sync graft; fixing it here would diverge from the baseline. The fix is deferred to a PawWork ripgrep filter cleanup PR or an upstream-first fix, to land alongside the other ripgrep deferrals (Schema.Class, bytes-union, abort short-circuit, mixed-separator). Do NOT re-flag the `.opencode` substring filter in upstream-sync PRs.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/src/file/ripgrep.ts:45-72
Timestamp: 2026-04-28T07:27:49.810Z
Learning: In `packages/opencode/src/file/ripgrep.ts` (Astro-Han/pawwork), `PathText`, the `lines` field inside `SearchMatch`, and `submatches[].match` accept only `{text: string}` and do NOT handle ripgrep's `{bytes: base64}` JSON variant (emitted for non-UTF-8 paths/match text). This matches the upstream `dev:packages/opencode/src/file/ripgrep.ts` baseline exactly. The fix — adding `Schema.Union([Schema.Struct({text: Schema.String}), Schema.Struct({bytes: Schema.String})])` for each location plus base64-decode in downstream consumers — is intentionally deferred to either a dedicated PawWork PR or an upstream-first fix inherited via the next sync. Do NOT re-flag the missing `{bytes: ...}` union variants in upstream-sync PRs; flag it only in a PawWork-authored PR that directly touches these schema definitions or their downstream consumers.
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 224
File: packages/app/src/i18n/zh.ts:0-0
Timestamp: 2026-04-24T17:08:46.780Z
Learning: In Astro-Han/pawwork PR `#224`, the first-occurrence `PawWork 爪印` branding rule originally specified in issue `#196` was superseded by an updated Chinese-branding spec. On all zh UI surfaces in `packages/app/src/i18n/zh.ts` (e.g., `dialog.model.unpaid.freeModels.title`, `session.new.subtitle`, `sidebar.gettingStarted.line1`), the correct and intentional target is fully localized `爪印` branding — no `PawWork` prefix. Do NOT flag these strings as missing the first-occurrence `PawWork 爪印` rule in future reviews.
📚 Learning: 2026-04-27T12:59:45.694Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/test/session/prompt-effect.test.ts:0-0
Timestamp: 2026-04-27T12:59:45.694Z
Learning: In session prompt/diagnostics tests (e.g., when verifying injected LLM “recovery reminder” copy), do not assert a single exact reminder phrasing tied to signature kind. If a scenario can produce both `input:` and `target:` signatures (for example, a `read` tool call with a `filePath` parameter), accept either phrasing: the input-repeat variant may say “repeated the same tool input 3 times” (literal count), and the target-repeat variant may say “failed against the same target multiple times” (“multiple times” without a count). Your assertions should allow both variants rather than narrowing to only the input-variant text.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-28T11:24:35.312Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 287
File: packages/opencode/test/session/subagent-lifecycle-integration.test.ts:41-47
Timestamp: 2026-04-28T11:24:35.312Z
Learning: In `packages/opencode/test/session/subagent-lifecycle-integration.test.ts` (Astro-Han/pawwork, PR `#287`), the test suite intentionally uses a manual `run` helper (`Effect.runPromise(program.pipe(Effect.provide(Layer.mergeAll(SubagentRun.defaultLayer, Session.defaultLayer)), Effect.orDie))`) and raw `bun:test` test cases rather than the `testEffect(...)` harness. All 27 lifecycle tests pass with this pattern. Migration to `testEffect` + `it.live(...)` is deferred to a dedicated test-harness sweep PR to avoid fixture drift before merge. Do NOT re-flag the manual `Effect.runPromise` runner in this file as needing harness migration until that sweep PR lands.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
📚 Learning: 2026-04-23T08:51:00.819Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 186
File: packages/opencode/test/plugin/workspace-adaptor.test.ts:139-144
Timestamp: 2026-04-23T08:51:00.819Z
Learning: For pawwork tests under packages/opencode/test/**, auth.json teardown may intentionally combine `Filesystem.write` (from `packages/opencode/src/util/filesystem.ts`) with `node:fs/promises` `unlink` for cleanup. Do not flag this as inconsistent style; it is the established/intentional pattern because `Filesystem` does not provide a `remove`/`unlink` helper.

Applied to files:

  • packages/opencode/test/session/retry.test.ts
  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-21T16:00:44.910Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 102
File: packages/opencode/src/config/model-id.ts:9-11
Timestamp: 2026-04-21T16:00:44.910Z
Learning: In Astro-Han/pawwork (`packages/opencode/src/config/model-id.ts`), `ConfigModelID` uses a regex that requires a non-empty provider segment and a non-empty model segment separated by `/`, but intentionally allows additional slashes within the model segment. This is because `Provider.parseModel` supports model IDs like `synthetic/hf:zai-org/GLM-4.7`. Do not flag the permissive slash handling as a bug.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
📚 Learning: 2026-04-27T08:58:00.665Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 264
File: packages/opencode/src/session/prompt.ts:531-538
Timestamp: 2026-04-27T08:58:00.665Z
Learning: When using Effect (e.g., `yield*` with `Effect`-style generator yielding), only use `yield* new SomeErrorClass(...)` if `SomeErrorClass` extends `Schema.TaggedErrorClass` (i.e., it implements Effect’s Yieldable interface). For plain `Error` subclasses (like `BlockedLoopError` / `LoopStopError`) or inline `new Error(...)` values, they are not yieldable and must be wrapped as `yield* Effect.fail(new PlainError(...))`. Do not recommend changing `yield* Effect.fail(new SomePlainError(...))` to `yield* new SomePlainError(...)` unless the error class extends `Schema.TaggedErrorClass`.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-28T12:49:09.792Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 287
File: packages/opencode/src/session/subagent-run.ts:445-445
Timestamp: 2026-04-28T12:49:09.792Z
Learning: In this repo’s `packages/opencode/src` module files, the canonical module-as-namespace pattern is load-bearing: at the bottom of each module file, add a self-reexport in the form `export * as <ModuleName> from "./<module-file>"` (e.g., in `subagent-run.ts`: `export * as SubagentRun from "./subagent-run"`). Consumers must import the namespace via a named binding (e.g., `import { SubagentRun } from ".../subagent-run"`) and then access members as `SubagentRun.Service`, `SubagentRun.defaultLayer`, `SubagentRun.AgentListFilter`, etc. During code review, do NOT flag these self-reexports as dead code or redundant; removing or altering them will break the consumer named binding.

Applied to files:

  • packages/opencode/src/plugin/codex.ts
  • packages/opencode/src/provider/error.ts
📚 Learning: 2026-04-27T11:19:24.963Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 271
File: packages/opencode/test/tool/websearch-auth.test.ts:0-0
Timestamp: 2026-04-27T11:19:24.963Z
Learning: In `packages/opencode/test/tool/websearch-auth.test.ts` (Astro-Han/pawwork), the tests intentionally use a small local `runWith` runner with raw `bun:test` and `Effect.runPromise` rather than the `testEffect` harness. Each test case injects a custom in-memory `Auth.Service` layer; switching to `testEffect` would be style-only churn without changing risk coverage. Do not flag these tests as needing harness migration.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-27T11:18:47.298Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 271
File: packages/opencode/test/tool/websearch.test.ts:21-78
Timestamp: 2026-04-27T11:18:47.298Z
Learning: In `packages/opencode/test/tool/websearch.test.ts`, the tests intentionally use manual `Effect.runPromise` with explicit `Effect.provide(...)` chains (including `Layer.succeed(Auth.Service, ...)`, `Layer.succeed(HttpClient.HttpClient, http)`, `WebSearchAuth.layer`, `Truncate.defaultLayer`, and `Agent.defaultLayer`) rather than the `testEffect(...)` harness. This is by design: the fake Auth and HTTP recovery-metadata layers must be explicitly injected and kept visible/scoped at the test site. Do NOT suggest migrating these tests to `testEffect` or removing the manual layer provides.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-28T04:56:21.338Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 270
File: packages/opencode/test/tool/tool-define.test.ts:2-9
Timestamp: 2026-04-28T04:56:21.338Z
Learning: In `packages/opencode/test/tool/tool-define.test.ts` (Astro-Han/pawwork, PR `#270`), the test suite intentionally uses a `ManagedRuntime.make(Layer.mergeAll(Truncate.defaultLayer, Agent.defaultLayer))` wrapper and raw `bun:test` `test(...)` cases rather than the `testEffect(...)` harness. This code was adopted wholesale from upstream via the graft strategy (per `project_upstream_strategy.md`). Migrating to `testEffect` is deferred to a follow-up PR so as not to mix refactor + harness-migration intents and to avoid drifting the diff from the upstream baseline needed for clean future grafts. Do NOT re-flag the `ManagedRuntime.make` usage in this file as needing harness migration until that follow-up lands.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-20T14:36:31.032Z
Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Define `const it = testEffect(...)` near the top of the test file and keep the test body inside `Effect.gen(function* () { ... })`. Yield services directly with `yield* MyService.Service` or `yield* MyTool`.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-22T09:32:54.556Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 126
File: packages/opencode/test/provider/provider.test.ts:64-85
Timestamp: 2026-04-22T09:32:54.556Z
Learning: In `packages/opencode/test/provider/provider.test.ts`, the file intentionally uses AppRuntime-based async helpers (`run`, `list`, `getProvider`, etc.) rather than `testEffect(...)` for all tests. Converting individual tests to `testEffect` while leaving the rest on the async pattern would create internal inconsistency. A full harness migration of this file is the right approach if the pattern needs to change, but that should be a separate PR.

Applied to files:

  • packages/opencode/test/plugin/codex.test.ts
📚 Learning: 2026-04-28T11:24:23.345Z
Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 287
File: packages/opencode/src/session/subagent-run.ts:8-16
Timestamp: 2026-04-28T11:24:23.345Z
Learning: In Astro-Han/pawwork, the established pattern for service-internal errors (e.g., in `packages/opencode/src/session/subagent-run.ts`) is plain classes (with a `readonly _tag` discriminant) or `NamedError.create` (zod-based), NOT `Schema.TaggedErrorClass`. Converting individual modules to `Schema.TaggedErrorClass` one-off would diverge from the surrounding code. The migration to `Schema.TaggedErrorClass` for service-internal errors is intentionally deferred to a dedicated repo-wide error-class sweep PR. Do NOT flag plain error classes in `subagent-run.ts` or other service-internal modules as needing `Schema.TaggedErrorClass` conversion until that sweep PR lands.

Applied to files:

  • packages/opencode/src/provider/error.ts
🔇 Additional comments (4)
packages/opencode/test/plugin/codex.test.ts (1)

147-199: Solid regression coverage for the OAuth Codex override.

This locks in both the GPT-5.5 limit override and the zero-cost mutation, so the new behavior is protected end-to-end.

packages/opencode/src/provider/error.ts (3)

112-115: isRetryable being required here matches the downstream contract.

This aligns with MessageV2.APIError consumption and avoids undefined leaking into retry handling.


118-120: Confirm the outer envelope is disposable.

Parsing raw.message as JSON and then replacing body with the inner object silently drops any sibling fields on the wrapper. If this stream shape is guaranteed to be {"message": "<json>"} only, please document that contract; otherwise keep the wrapper and parse the nested payload separately.


154-160: No action needed. OpenAI's official documentation explicitly classifies server_error as a transient server-side failure that should be retried. The code correctly implements this per the upstream API contract, and other error codes with semantic meaning (quota, usage limits, invalid input) are already appropriately marked as non-retryable.

Comment thread packages/opencode/src/plugin/codex.ts Outdated
Comment thread packages/opencode/test/session/retry.test.ts
@Astro-Han

Copy link
Copy Markdown
Owner Author

Review follow-up pushed in 32a79ec.

P0 server_error retryability: verified upstream anomalyco/opencode#24063 treats OpenAI stream server_error as retryable. Kept parity and did not add sub-code/message filtering in this sync PR. Also checked the local retry policy: RETRY_MAX_DELAY caps delay length, not attempt count, so I did not rely on it as an infinite-retry guard.

P1 tests and typing: added fallback coverage for "Server error." and a non-retryable default-path stream error with bad_request. Removed the ts-expect-error by using a local typed limit object that bridges the old SDK hook Model shape to opencode's input-aware limit usage.

P2 matching and overwrite coverage: switched the limit override to canonical model.api.id with an explicit /^gpt-5.5(?:$|-)/ matcher, added variant and nonmatch tests, and asserted the original 1,050,000 limit before loader mutation.

P3 responseBody and commit scope: documented why the inner OpenAI provider payload is used for responseBody. Left the existing commit scopes in place and added this review-fix commit for the follow-up.

Inline review threads have been replied to and resolved. CodeRabbit's retryable:false server_error suggestion was intentionally not applied because it conflicts with the upstream behavior this PR is syncing.

@Astro-Han Astro-Han merged commit cfe2fac into dev Apr 28, 2026
25 checks passed
@Astro-Han Astro-Han deleted the fix/openai-oauth-context branch April 28, 2026 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working harness Model harness, prompts, tool descriptions, and session mechanics P1 High priority upstream Tracked upstream or vendor behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant