Skip to content

fix(app): add fallback for lost question.asked SSE events#379

Merged
Astro-Han merged 2 commits into
devfrom
pawwork/question-dock-sync-fix
May 2, 2026
Merged

fix(app): add fallback for lost question.asked SSE events#379
Astro-Han merged 2 commits into
devfrom
pawwork/question-dock-sync-fix

Conversation

@Astro-Han

@Astro-Han Astro-Han commented May 2, 2026

Copy link
Copy Markdown
Owner

Summary

Add a self-healing fallback in session-composer-state that detects running question tool parts in message parts when the questionRequest() memo is empty, and re-fetches the pending question list from the backend via sdk.client.question.list().

Why

The question dock relies exclusively on sync.data.question to render, which is populated by two paths:

  1. Bootstrap: sdk.question.list() called once at startup
  2. SSE events: question.asked real-time push

If the SSE event is lost (stream disconnect, transient error, reconnect gap), the question data never enters the frontend store. The dock never renders, the model waits forever for an answer, and the session appears stuck — even though the backend question is still pending and the message parts already contain the question tool in running state.

This was observed in a real session where the model called the question tool, the backend dispatched it (state = running), but the frontend never showed the question dock. The session froze with no way to recover.

Related Issue

None.

Human Review Status

Pending.

Review Focus

  • The fallback effect logic in session-composer-state.ts lines 32–89: correct detection of running question parts, proper guard against re-fetch loops, correct store update via sync.set + reconcile
  • Whether scanning the last 5 messages is sufficient and performant

Risk Notes

  • The fallback only activates when there is a mismatch (running question part exists but no question request in store). It is a no-op under normal conditions.
  • questionRefetchInflight prevents duplicate fetches but is module-scoped (shared across session switches). This is intentional — re-fetches are rare and a single inflight guard is sufficient.
  • The fix does not address the root cause (SSE event reliability). A follow-up should consider event replay on reconnect.

How To Verify

Typecheck: passes (npx tsc --noEmit --project packages/app/tsconfig.json)
Existing tests: 7/7 pass (session-composer-state.test.ts)
Event reducer tests: 13/13 pass (event-reducer.test.ts)
Manual: the fallback activates when a running question tool part exists but questionRequest() is undefined

Screenshots or Recordings

Not applicable — no visible UI change. The fix ensures the question dock appears when it should.

Checklist

  • Human review status is stated above as pending
  • This PR has type, scope, and priority labels, or I requested maintainer labeling
  • I described the review focus and any meaningful risks
  • I listed the relevant verification steps and the key result for each
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope
  • I considered macOS and Windows impact — no platform-specific changes
  • I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant — none
  • I reviewed the final diff for unrelated changes and suspicious dependency changes
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English

Architectural Follow-up

This fix treats the symptom (missing question data) at the UI layer. The root cause is that the SSE event stream has no replay/recovery mechanism. Consider:

  1. Event replay on reconnect: client sends last-seen event ID, server replays missed events. This fixes the problem for all event types (question, permission, session status, etc.)
  2. Parts as source of truth: message parts are persisted and loaded via message sync (not SSE-dependent). The question dock could derive its state from parts directly, using events only as a "new data available" signal.
  3. Polling fallback at the sync layer: rather than each UI component adding its own fallback, the sync layer could periodically reconcile its event-driven state against the backend API.

Summary by CodeRabbit

  • Bug Fixes
    • Improved session composer stability: automatically detects and recovers missing or incomplete question data by refetching pending questions, preventing potential data loss and ensuring questions remain available during session composition. Refetching is protected against duplicate requests and silently handles transient errors to avoid user disruptions.

When the question.asked SSE event is lost (stream disconnect, transient
error), the question dock never appears because sync.data.question has
no entry for the session. The backend question remains pending, the
model waits forever, and the user sees a stuck session.

Add a fallback effect in session-composer-state that detects running
question tool parts in message parts when questionRequest() is empty,
and re-fetches the pending question list from the backend.
@coderabbitai

coderabbitai Bot commented May 2, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f68f8eb1-464d-4ded-9b70-a18c57e2cf38

📥 Commits

Reviewing files that changed from the base of the PR and between c136efa and a95b823.

📒 Files selected for processing (1)
  • packages/app/src/pages/session/composer/session-composer-state.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/app/src/pages/session/composer/session-composer-state.ts

📝 Walkthrough

Walkthrough

A defensive fallback was added to createSessionComposerState: when a question tool part is running but questionRequest() is absent, it refetches pending questions via sdk.client.question.list(), filters and groups valid QuestionRequest objects, and reconciles them into sync.

Changes

Defensive Question Refetch

Layer / File(s) Summary
Imports
packages/app/src/pages/session/composer/session-composer-state.ts
Added batch and reconcile imports from solid-js/store.
Detection Effect
packages/app/src/pages/session/composer/session-composer-state.ts
New createEffect watches for recent message parts containing a question tool part in running state while questionRequest() is missing.
Async Refetch
packages/app/src/pages/session/composer/session-composer-state.ts
Effect triggers guarded async call to sdk.client.question.list(), filters results to valid QuestionRequests, partitions them by sessionID, sorts by id, and writes them into sync using reconcile.
Concurrency & Error Handling
packages/app/src/pages/session/composer/session-composer-state.ts
Refetch is protected by an in-flight boolean to avoid overlap; errors are swallowed and the guard is cleared in .finally().
sequenceDiagram
  participant Composer as SessionComposerState
  participant SDK as SDK Client
  participant Sync as Sync Store

  Composer->>Composer: detect running question part without questionRequest
  Composer->>SDK: sdk.client.question.list()
  SDK-->>Composer: list of pending questions
  Composer->>Sync: reconcile grouped/sorted QuestionRequests by id
  Sync-->>Composer: state updated
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

P2, ui

Poem

🐰 A question ran off, hidden from view,
I listened, I fetched, returned a fresh crew.
Grouped by their sessions, sorted by id,
Reconciled gently — now each one sits tidy.
Hop, patch, and stitch — stable state anew.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding a fallback mechanism for lost question.asked SSE events, which matches the core purpose of the pull request.
Description check ✅ Passed The description is comprehensive and follows the template structure with all major sections completed: Summary, Why, Review Focus, Risk Notes, How To Verify, and Checklist; it provides clear context about the problem, solution approach, and verification steps.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch pawwork/question-dock-sync-fix

Review rate limit: 6/10 reviews remaining, refill in 20 minutes and 28 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@Astro-Han Astro-Han added bug Something isn't working app Application behavior and product flows P1 High priority labels May 2, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a fallback mechanism to handle lost SSE events by re-fetching pending questions when a "question" tool is detected in a running state without a corresponding request. Feedback suggests removing the defer: true option to ensure the check runs on component mount, refining the scope and logic of the refetch flag to prevent cross-session blocking, and using a more robust sort comparator for question IDs.

Comment thread packages/app/src/pages/session/composer/session-composer-state.ts
Comment thread packages/app/src/pages/session/composer/session-composer-state.ts
Comment thread packages/app/src/pages/session/composer/session-composer-state.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/app/src/pages/session/composer/session-composer-state.ts`:
- Around line 55-85: The current refetch handler using
sdk.client.question.list() swallows errors and only clears
questionRefetchInflight, leaving state stuck; modify the catch block for the
anonymous subscription callback (the function keyed by sessionID that sets
questionRefetchInflight) to schedule a bounded retry/backoff or re-arm a
reactive retry trigger keyed by the same sessionID: on failure increment a small
retry counter or use setTimeout with exponential backoff (capped attempts),
ensure questionRefetchInflight is false between attempts, and re-invoke the same
fetch logic (the sdk.client.question.list() flow that updates
sync.set("question", sid, reconcile(...))) until success or max retries;
reference variables/functions: sdk.client.question.list,
questionRefetchInflight, sessionID, and the batch/reconcile sync.set logic so
the retry re-runs the same update path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: eee73424-bcec-4df1-91e0-ba4803d11f4f

📥 Commits

Reviewing files that changed from the base of the PR and between 0d8cae9 and c136efa.

📒 Files selected for processing (1)
  • packages/app/src/pages/session/composer/session-composer-state.ts

Comment thread packages/app/src/pages/session/composer/session-composer-state.ts
When the component mounts with a stuck question (e.g. after page refresh),
defer:true prevents the fallback from detecting it until a subsequent state
change. Removing it allows immediate detection.

@Astro-Han Astro-Han left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review response

#1 (high): Remove defer: true — Fixed in a95b823

Good catch. Without defer, the fallback now runs on mount, covering the page-refresh scenario where SSE is always disconnected.

#2 (medium): SessionID inflight tracking — Won't fix (P3, YAGNI)

Having two sessions simultaneously stuck on question tools requires the model to call question in two separate sessions at the same time, which is architecturally impossible (question blocks the turn until answered). A boolean flag is sufficient.

#3 (medium): Sort comparator completeness — Won't fix (P3, YAGNI)

QuestionIDs are unique ascending timestamps. Equality never occurs in this domain. The comparator is correct for the data.

#4 (major): Retry on failed fetch — Won't fix for now (P2, double-failure)

This would require both the SSE event AND the fallback HTTP request to fail independently. Even if it happens, any user interaction (sending a message, switching sessions) changes reactive dependencies and naturally retriggers the effect. Adding retry/backoff logic for this extreme case would violate YAGNI — can revisit if actually reported.

@Astro-Han Astro-Han merged commit e97d08c into dev May 2, 2026
26 of 27 checks passed
@Astro-Han Astro-Han deleted the pawwork/question-dock-sync-fix branch May 2, 2026 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

app Application behavior and product flows bug Something isn't working P1 High priority

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant