fix: add session loop diagnostics by Astro-Han · Pull Request #204 · Astro-Han/pawwork

Astro-Han · 2026-04-24T03:01:47Z

Summary

Add lightweight session diagnostics for repeated tool behavior:

record per-tool loop diagnostics in tool part state metadata
detect repeated tool inputs and repeated tool error classes within the same user block
inject a one-time model-facing reminder after the third repeat
preserve diagnostics metadata across tool completion, errors, and shell output updates
store target summaries as non-readable hashes to avoid leaking tool input details

Why

Issue #133 identified sessions where models repeatedly called the same tool input or retried the same failing tool pattern. This keeps V1 small: it observes and nudges the model without blocking tools, asking the user, adding model-specific rules, or changing normal exploration behavior.

Related Issue

Closes #133

How To Verify

cd packages/opencode
bun test test/session/diagnostics.test.ts
bun test test/session/prompt-effect.test.ts --timeout 30000
bun test test/session/processor-effect.test.ts --timeout 30000
bun run typecheck
bun run test

Final full package result:

2053 pass
8 skip
1 todo
0 fail

Also ran:

git diff --check

Screenshots or Recordings

Not applicable. This is harness/session behavior with no visible UI change.

Checklist

I linked the related issue, or stated why there is no issue
This PR has type, scope, and priority labels, or I requested maintainer labeling
I listed the relevant verification steps, including tests when behavior changed
I manually checked visible UI or copy changes when needed, with screenshots or recordings
I considered macOS and Windows impact for desktop, packaging, updater, signing, paths, shell, or permissions changes
I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, or generated/local file changes when relevant
I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Maintainer labeling requested: type fix, scope opencode/session, priority based on issue #133.

Summary by CodeRabbit

Release Notes

New Features
- Enhanced session loop detection automatically identifies and tracks repeated tool calls with identical inputs and errors.
- Diagnostic reminders are injected when repeated patterns are detected, helping prevent infinite loops and unnecessary operations.

coderabbitai · 2026-04-24T03:02:01Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: aae710c3-36ca-416e-9e8e-0645c4fbe323

📥 Commits

Reviewing files that changed from the base of the PR and between d35885d and 3440a8e.

📒 Files selected for processing (5)

packages/opencode/src/session/diagnostics.ts
packages/opencode/src/session/processor.ts
packages/opencode/src/session/prompt.ts
packages/opencode/test/session/diagnostics.test.ts
packages/opencode/test/session/prompt-effect.test.ts

📜 Recent review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: smoke-macos-arm64
GitHub Check: unit-windows-opencode-config-project
GitHub Check: unit-windows-opencode-server-tools
GitHub Check: unit-windows-app
GitHub Check: unit-windows-opencode-session
GitHub Check: unit-windows-desktop
GitHub Check: unit-opencode
GitHub Check: unit-app
GitHub Check: unit-desktop
GitHub Check: typecheck
GitHub Check: analyze-js-ts
GitHub Check: e2e-artifacts

🧰 Additional context used

📓 Path-based instructions (2)

packages/opencode/**/*.ts

📄 CodeRabbit inference engine (packages/opencode/AGENTS.md)

packages/opencode/**/*.ts: Use Effect.gen(function* () { ... }) for Effect composition
Use Effect.fn("Domain.method") for named/traced effects and Effect.fnUntraced for internal helpers; these accept pipeable operators as extra arguments to avoid unnecessary outer .pipe() wrappers
Use Effect.callback for callback-based APIs
Prefer DateTime.nowAsDate over new Date(yield* Clock.currentTimeMillis) when you need a Date in Effect code
Use Schema.Class for multi-field data in Effect schemas
Use branded schemas (Schema.brand) for single-value types in Effect
Use Schema.TaggedErrorClass for typed errors in Effect schemas
Use Schema.Defect instead of unknown for defect-like causes in Effect code
In Effect.gen / Effect.fn, prefer yield* new MyError(...) over yield* Effect.fail(new MyError(...)) for direct early-failure branches
Use makeRuntime from src/effect/run-service.ts for all services; it returns { runPromise, runFork, runCallback } backed by a shared memoMap that deduplicates layers
Use InstanceState from src/effect/instance-state.ts for per-directory or per-project state that needs per-instance cleanup; do work directly in the InstanceState.make closure where ScopedCache handles run-once semantics
Use Effect.addFinalizer or Effect.acquireRelease inside the InstanceState.make closure for cleanup (subscriptions, process teardown, etc.)
Use Effect.forkScoped inside the InstanceState.make closure for background stream consumers — the fiber is interrupted when the instance is disposed
Prefer FileSystem.FileSystem instead of raw fs/promises for effectful file I/O in Effect services
Prefer ChildProcessSpawner.ChildProcessSpawner with ChildProcess.make(...) instead of custom process wrappers in Effect services
Prefer HttpClient.HttpClient instead of raw fetch in Effect services
Prefer Path.Path, Config, Clock, and DateTime services when those concerns are already inside Effect code
For backgroun...

Files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/src/session/diagnostics.ts
packages/opencode/test/session/diagnostics.test.ts
packages/opencode/src/session/processor.ts
packages/opencode/src/session/prompt.ts

packages/opencode/test/**/*.test.{ts,tsx}

📄 CodeRabbit inference engine (packages/opencode/test/AGENTS.md)

packages/opencode/test/**/*.test.{ts,tsx}: Use the tmpdir function from fixture/fixture.ts to create temporary directories for tests with automatic cleanup. Use await using syntax to ensure automatic cleanup when the variable goes out of scope.
When using the tmpdir function with git repository support, pass the git: true option to initialize a git repo with a root commit.
Use the config option in tmpdir to write an opencode.json config file during test setup by passing a partial Config.Info object.
Use the init option in tmpdir to define custom setup functions that can return extra data accessible via tmp.extra, and use the dispose option for custom cleanup logic.
Use testEffect(...) from test/lib/effect.ts for tests that exercise Effect services or Effect-based workflows.
Use it.effect(...) when the test should run with TestClock and TestConsole. Use it.live(...) when the test depends on real time, filesystem mtimes, child processes, git, locks, or other live OS behavior.
Prefer Effect-aware helpers from fixture/fixture.ts over building manual runtimes in tests: use tmpdirScoped() for scoped temp directories, provideInstance(dir)(effect) for low-level binding without directory creation, provideTmpdirInstance(...) for single temp instance binding, or provideTmpdirServer(...) for tests that also need the test LLM server.
Define const it = testEffect(...) near the top of the test file and keep the test body inside Effect.gen(function* () { ... }). Yield services directly with yield* MyService.Service or yield* MyTool.
Avoid custom ManagedRuntime, attach(...), or ad hoc run(...) wrappers in Effect tests when testEffect(...) already provides the runtime.
When a test needs instance-local state, prefer provideTmpdirInstance(...) or provideInstance(...) over manual Instance.provide(...) inside Promise-style tests.

Files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/test/session/diagnostics.test.ts

🧠 Learnings (22)

📓 Common learnings

Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 193
File: packages/app/e2e/sidebar/sidebar-leading-slot.spec.ts:5-55
Timestamp: 2026-04-23T15:25:31.118Z
Learning: In Astro-Han/pawwork E2E tests, reaching a real "running" session state requires the `project` fixture (for model bootstrap) plus `llm.wait(1)` orchestration. The bare `sdk` fixture used by `withSession` does not trigger an LLM call even with `agent: "build"` + `system` prompt set via `sdk.session.promptAsync`, so simulating running state is not lightweight in the current test infrastructure.

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Define `const it = testEffect(...)` near the top of the test file and keep the test body inside `Effect.gen(function* () { ... })`. Yield services directly with `yield* MyService.Service` or `yield* MyTool`.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/test/session/diagnostics.test.ts
packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Prefer Effect-aware helpers from `fixture/fixture.ts` over building manual runtimes in tests: use `tmpdirScoped()` for scoped temp directories, `provideInstance(dir)(effect)` for low-level binding without directory creation, `provideTmpdirInstance(...)` for single temp instance binding, or `provideTmpdirServer(...)` for tests that also need the test LLM server.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Use `it.effect(...)` when the test should run with `TestClock` and `TestConsole`. Use `it.live(...)` when the test depends on real time, filesystem mtimes, child processes, git, locks, or other live OS behavior.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Use `testEffect(...)` from `test/lib/effect.ts` for tests that exercise Effect services or Effect-based workflows.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Avoid custom `ManagedRuntime`, `attach(...)`, or ad hoc `run(...)` wrappers in Effect tests when `testEffect(...)` already provides the runtime.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : When a test needs instance-local state, prefer `provideTmpdirInstance(...)` or `provideInstance(...)` over manual `Instance.provide(...)` inside Promise-style tests.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-22T09:32:54.556Z

Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 126
File: packages/opencode/test/provider/provider.test.ts:64-85
Timestamp: 2026-04-22T09:32:54.556Z
Learning: In `packages/opencode/test/provider/provider.test.ts`, the file intentionally uses AppRuntime-based async helpers (`run`, `list`, `getProvider`, etc.) rather than `testEffect(...)` for all tests. Converting individual tests to `testEffect` while leaving the rest on the async pattern would create internal inconsistency. A full harness migration of this file is the right approach if the pattern needs to change, but that should be a separate PR.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Use the `tmpdir` function from `fixture/fixture.ts` to create temporary directories for tests with automatic cleanup. Use `await using` syntax to ensure automatic cleanup when the variable goes out of scope.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-20T14:36:04.113Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/app/e2e/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:04.113Z
Learning: Applies to packages/app/e2e/packages/app/src/testing/**/*.ts : Test-only hooks must be inert unless explicitly enabled and should not add normal-runtime listeners, reactive subscriptions, or per-update allocations

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/test/session/diagnostics.test.ts

📚 Learning: 2026-04-23T15:25:31.118Z

Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 193
File: packages/app/e2e/sidebar/sidebar-leading-slot.spec.ts:5-55
Timestamp: 2026-04-23T15:25:31.118Z
Learning: In Astro-Han/pawwork E2E tests, reaching a real "running" session state requires the `project` fixture (for model bootstrap) plus `llm.wait(1)` orchestration. The bare `sdk` fixture used by `withSession` does not trigger an LLM call even with `agent: "build"` + `system` prompt set via `sdk.session.promptAsync`, so simulating running state is not lightweight in the current test infrastructure.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-20T14:36:31.032Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/test/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:31.032Z
Learning: Applies to packages/opencode/test/**/*.test.{ts,tsx} : Use the `init` option in `tmpdir` to define custom setup functions that can return extra data accessible via `tmp.extra`, and use the `dispose` option for custom cleanup logic.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts

📚 Learning: 2026-04-23T08:51:00.819Z

Learnt from: Astro-Han
Repo: Astro-Han/pawwork PR: 186
File: packages/opencode/test/plugin/workspace-adaptor.test.ts:139-144
Timestamp: 2026-04-23T08:51:00.819Z
Learning: For pawwork tests under packages/opencode/test/**, auth.json teardown may intentionally combine `Filesystem.write` (from `packages/opencode/src/util/filesystem.ts`) with `node:fs/promises` `unlink` for cleanup. Do not flag this as inconsistent style; it is the established/intentional pattern because `Filesystem` does not provide a `remove`/`unlink` helper.

Applied to files:

packages/opencode/test/session/prompt-effect.test.ts
packages/opencode/test/session/diagnostics.test.ts

📚 Learning: 2026-04-20T14:36:04.113Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/app/e2e/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:04.113Z
Learning: Applies to packages/app/e2e/**/*.spec.ts : Call `project.trackSession(sessionID, directory?)` and `project.trackDirectory(directory)` for any resources created outside the fixture so teardown can clean them up

Applied to files:

packages/opencode/test/session/diagnostics.test.ts

📚 Learning: 2026-04-20T14:36:04.113Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/app/e2e/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:04.113Z
Learning: Applies to packages/app/e2e/**/*.spec.ts : Use fixture-managed cleanup with `withSession(sdk, title, callback)` for temporary sessions instead of calling `sdk.session.delete(...)` directly

Applied to files:

packages/opencode/test/session/diagnostics.test.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Use `Effect.fn("Domain.method")` for named/traced effects and `Effect.fnUntraced` for internal helpers; these accept pipeable operators as extra arguments to avoid unnecessary outer `.pipe()` wrappers

Applied to files:

packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Prefer `Path.Path`, `Config`, `Clock`, and `DateTime` services when those concerns are already inside Effect code

Applied to files:

packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Use `Effect.gen(function* () { ... })` for Effect composition

Applied to files:

packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Use `Effect.cached` when multiple concurrent callers should share a single in-flight computation rather than storing `Fiber | undefined` or `Promise | undefined` manually

Applied to files:

packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Prefer `FileSystem.FileSystem` instead of raw `fs/promises` for effectful file I/O in Effect services

Applied to files:

packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : Use `Effect.callback` for callback-based APIs

Applied to files:

packages/opencode/src/session/processor.ts

📚 Learning: 2026-04-20T14:36:21.288Z

Learnt from: CR
Repo: Astro-Han/pawwork PR: 0
File: packages/opencode/AGENTS.md:0-0
Timestamp: 2026-04-20T14:36:21.288Z
Learning: Applies to packages/opencode/**/*.ts : For background loops or scheduled tasks, use `Effect.repeat` or `Effect.schedule` with `Effect.forkScoped` in the layer definition

Applied to files:

packages/opencode/src/session/processor.ts

🔇 Additional comments (21)

packages/opencode/src/session/diagnostics.ts (5)

64-66: LGTM: Hash function is appropriate for diagnostics fingerprinting.

The 64-bit truncated SHA-256 hash provides sufficient collision resistance for tracking tool input/error patterns within a session context.

68-72: LGTM: Input normalization correctly handles semantic equivalence.

The function filters non-semantic keys (request IDs, trace IDs, nonces), sorts object keys for stable serialization, and trims strings. This ensures semantically equivalent inputs produce identical hashes.

80-97: LGTM: Error fingerprinting normalizes variable components effectively.

The regex-based normalization replaces URLs, quoted strings, paths, hex IDs, and numbers with placeholders, ensuring equivalent error classes produce the same fingerprint regardless of specific values.

231-287: LGTM: consumeReminders correctly marks pending reminders as injected and generates appropriate nudge text.

The function properly:

Filters for assistant messages matching the parentID

Transitions pending reminders to "injected" status with timestamp

Returns updated tool parts for persistence

Generates differentiated text for input vs error repeats

212-229: No action needed; current behavior is intentional.

The spread-based merge at the loop level does replace the entire reminders array, but this is not a bug. All usages follow a pattern where reminders are either freshly generated or explicitly managed before mergeMetadata is called. The function is not designed for accumulating reminders across multiple merges; responsibility for reminder management is delegated to the callers.

packages/opencode/test/session/prompt-effect.test.ts (1)

494-540: LGTM: Comprehensive integration test for diagnostics reminder injection.

The test correctly validates the end-to-end flow:

Three identical tool calls trigger the repeat detection

The diagnostics reminder text is injected into the next LLM request

Tool metadata is persisted with proper inputRepeatCount and input_repeat reminder state

The test follows project conventions using it.live, Effect.gen, and provideTmpdirServer fixture.

packages/opencode/src/session/prompt.ts (4)

38-38: LGTM: Import added for SessionDiagnostics integration.

887-887: LGTM: Shell metadata updates now preserve existing diagnostics.

The spread pattern { ...part.state.metadata, output, description: "" } correctly preserves any existing diagnostic metadata while updating the output field. This aligns with the mergeMetadata contract from SessionDiagnostics.

Also applies to: 905-905

1399-1399: LGTM: Aborted shell completion preserves diagnostic metadata.

Consistent with the other shell metadata updates, the spread pattern ensures diagnostics are not lost when finalizing an aborted shell command.

1514-1529: LGTM: Diagnostics consumption correctly integrated into the run loop.

The implementation:

Consumes pending reminders after other reminder insertions

Appends synthetic diagnostic text to the user message for model input (in-memory only)

Persists the updated tool parts with reminders marked as "injected"

The synthetic text part is correctly added only to the in-memory msgs array used for model input, not persisted, ensuring the nudge is model-facing only as intended by the PR objectives.

packages/opencode/src/session/processor.ts (5)

167-174: LGTM: Helper functions for extracting tool state metadata.

Clean extraction of metadata and diagnostics from tool parts with proper type guards using isRecord.

176-217: LGTM: Record collection functions properly scope diagnostics to user blocks.

Both loopRecords and errorRecords correctly filter by parentID to ensure repeat tracking is scoped to the current user message block, preventing false positives across different conversation turns.

230-243: LGTM: Tool completion preserves diagnostics metadata.

The completeToolCall function correctly extracts existing diagnostics from the running state and merges them into the completed state using SessionDiagnostics.mergeMetadata.

246-273: LGTM: Tool failure observes error patterns and preserves diagnostics.

The failToolCall function:

Observes the error pattern via SessionDiagnostics.observeToolError when parentID exists

Falls back to existing diagnostics when no parentID

Merges diagnostics into the error state metadata

This ensures error repeat detection works correctly while preserving any existing diagnostic data.

350-383: LGTM: Tool call event handler integrates diagnostics observation.

The handler correctly:

Updates the tool part to running state first

Guards against missing parentID before observing

Fetches session info to get parentSessionID for subagent tracking

Observes the tool call and merges diagnostics into the running state

The sequential flow ensures the tool part exists before diagnostics are attached.

packages/opencode/test/session/diagnostics.test.ts (6)

1-17: LGTM: Well-structured test setup for SessionDiagnostics.

The test file properly imports dependencies, creates test fixtures, and defines a helpful loop() extraction function for cleaner assertions.

18-40: LGTM: Input normalization tests verify semantic stability.

The tests correctly verify:

Reordered keys and filtered non-semantic fields produce stable hashes

Semantic fields like cursor produce different hashes when values differ

42-144: LGTM: Comprehensive observeToolCall test coverage.

The tests verify:

Reminder creation only on the third repeat (not second, not fourth)

Cross-block isolation prevents false positives

Different URLs in exploratory patterns don't trigger reminders

This covers the key behaviors of the repeat detection logic.

146-173: LGTM: Error normalization test validates fingerprinting.

The test correctly verifies that equivalent error messages with different numeric values (positions 12, 18, 44) are consolidated into a single error class, triggering one reminder on the third occurrence.

175-217: LGTM: Metadata helper tests verify merging and privacy.

The tests correctly verify:

mergeMetadata preserves existing tool metadata while adding diagnostics

targetSummary hashes sensitive values without exposing them in the summary string

219-291: LGTM: consumeReminders test verifies injection and idempotence.

The test correctly verifies:

Pending reminders are marked as "injected" with the provided timestamp

Reminder text is generated for the model

Second consumption produces no reminders (idempotent)

📝 Walkthrough

Walkthrough

A new SessionDiagnostics module is introduced to track tool call and error patterns by computing stable input/error fingerprints and generating reminders when repeats are detected. The session processor and prompt handler are updated to integrate this diagnostics system, observing tool calls/errors, consuming pending reminders, and preserving metadata during state transitions.

Changes

Cohort / File(s)	Summary
SessionDiagnostics Core `packages/opencode/src/session/diagnostics.ts`	New module implementing stable fingerprint computation (SHA-256), metadata merging, tool-call/error observation with repeat tracking, reminder consumption from prior messages, and synthetic reminder text generation.
Integration Points `packages/opencode/src/session/processor.ts`, `packages/opencode/src/session/prompt.ts`	Updated to instrument tool calls/errors via `SessionDiagnostics`, extract and merge diagnostic metadata into tool parts, and append diagnostic reminder text to message content when repeats are detected.
Test Coverage `packages/opencode/test/session/diagnostics.test.ts`, `packages/opencode/test/session/prompt-effect.test.ts`	Unit tests validating fingerprinting, repeat tracking, metadata utilities, and reminder consumption; integration test confirming loop diagnostics reminders are injected into LLM requests with correct metadata.

Sequence Diagram

sequenceDiagram
    participant PromptLoop as Prompt Loop
    participant SessionProc as SessionProcessor
    participant Diagnostics as SessionDiagnostics
    participant AssistantMsg as Prior Messages
    participant MetadataMgr as Part Metadata

    PromptLoop->>SessionProc: Start/Resume tool call
    SessionProc->>AssistantMsg: Fetch prior assistant messages
    AssistantMsg-->>SessionProc: Return tool part records
    SessionProc->>Diagnostics: observeToolCall(sessionID, tool, input, records)
    Diagnostics->>Diagnostics: Compute input fingerprint<br/>Increment repeat count
    Diagnostics-->>SessionProc: Return observed diagnostics
    SessionProc->>MetadataMgr: Merge diagnostics into part metadata
    
    PromptLoop->>Diagnostics: consumeReminders(messages, parentID)
    Diagnostics->>AssistantMsg: Scan for matching pending reminders
    Diagnostics->>MetadataMgr: Mark reminders as injected
    Diagnostics-->>PromptLoop: Return reminder text + updated messages
    
    PromptLoop->>PromptLoop: Append reminder text to user message
    PromptLoop->>PromptLoop: Persist updated diagnostic parts

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested labels

P2, app

Poem

🐰 A curious loop the rabbit did track,
With fingerprints hashed and reminders brought back,
When tools call and repeat in the same session's dance,
The diagnostics whisper, "Let's try something else, perchance!"
Round and round, yet aware—no more getting stuck,
Just diagnostics and metadata bringing us luck! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix: add session loop diagnostics' clearly and concisely summarizes the main change—adding diagnostic functionality for session loop detection.
Description check	✅ Passed	The PR description fully covers all required template sections: summary of changes, problem/goal explanation, related issue link, verification steps with test commands, and checklist completion.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch codex/feat-i133-diagnostics

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces a new SessionDiagnostics module designed to detect and handle 'doom loops' by tracking repeated tool inputs and error fingerprints. It replaces the previous threshold logic in SessionProcessor with a more sophisticated hashing and normalization approach, injecting system reminders when repetitions are detected. Additionally, the PR refactors service dependencies in the session processor and includes new unit and integration tests. I have no feedback to provide as there were no review comments to assess.

fix: add session loop diagnostics

3440a8e

gemini-code-assist Bot reviewed Apr 24, 2026

View reviewed changes

Astro-Han added enhancement New feature or request P1 High priority harness Model harness, prompts, tool descriptions, and session mechanics labels Apr 24, 2026

Astro-Han merged commit da20c1e into dev Apr 24, 2026
26 checks passed

Astro-Han deleted the codex/feat-i133-diagnostics branch April 24, 2026 03:17

This was referenced Apr 28, 2026

fix(session): sync session edge fixes #308

Merged

feat(session): gate repeated successful tool calls #406

Merged

fix(session): relax successful same-target loop gate #416

Merged

coderabbitai Bot mentioned this pull request May 12, 2026

feat(opencode): observability for session export (abort provenance + title trace + release version) #563

Merged

11 tasks

Astro-Han mentioned this pull request May 19, 2026

[Task] sync upstream opencode provider fixes (12 items, post-fork 2026-04-14) #743

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: add session loop diagnostics#204

fix: add session loop diagnostics#204
Astro-Han merged 1 commit into
devfrom
codex/feat-i133-diagnostics

Astro-Han commented Apr 24, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 24, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Astro-Han commented Apr 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Related Issue

How To Verify

Screenshots or Recordings

Checklist

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Suggested labels

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Astro-Han commented Apr 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 24, 2026 •

edited

Loading