Skip to content

fix(runtime-fallback): harden delegated fallback retry flow#4223

Open
andomeder wants to merge 2 commits into
code-yeongyu:devfrom
andomeder:wip/pr2-harden-delegated-runtime-fallback
Open

fix(runtime-fallback): harden delegated fallback retry flow#4223
andomeder wants to merge 2 commits into
code-yeongyu:devfrom
andomeder:wip/pr2-harden-delegated-runtime-fallback

Conversation

@andomeder

@andomeder andomeder commented May 20, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes a reproduced runtime-fallback failure chain in delegated child sessions:

  • immediate quota/retry statuses could advance fallback incorrectly
  • object-shaped/top-level message and event payloads were not normalized consistently
  • sync delegated retries could be dropped behind prompt-gate reservations
  • empty assistant completion rows could suppress fallback without producing a usable result
  • delegated child retries could lose their bootstrap prompt before a real user turn was persisted
  • fallback replays could create a second visible opening prompt

Changes

  • trigger runtime fallback directly on immediate provider retry/quota signals instead of treating them as successful progress
  • normalize object-shaped and top-level runtime-fallback/session-message payloads before fallback state, visibility checks, and sync completion checks
  • queue delegated sync fallback retries behind the active prompt reservation instead of dropping them as reserved
  • ignore empty assistant completion markers unless they carry meaningful payload
  • ignore synthetic/internal retry turns when rebuilding fallback retry payloads
  • preserve delegated child bootstrap retry parts until a real persisted user turn exists
  • redispatch fallback retry prompts as internal-only prompts
  • mark internal retry text parts as synthetic: true so they stay hidden in history instead of appearing as a second user prompt
  • cancel stale queued runtime-fallback prompts after successful retry progression
  • add regression coverage for the reproduced hang and duplicate-prompt paths

Testing

  • bun test src/hooks/runtime-fallback/auto-retry.test.ts src/hooks/runtime-fallback/index.test.ts src/hooks/runtime-fallback/message-update-handler.test.ts src/tools/delegate-task/sync-session-poller.test.ts
  • bun test src/shared/internal-initiator-marker.test.ts src/hooks/runtime-fallback/auto-retry.test.ts src/hooks/runtime-fallback/index.test.ts src/hooks/runtime-fallback/message-update-handler.test.ts src/tools/delegate-task/sync-session-poller.test.ts src/tools/delegate-task/sync-prompt-sender.test.ts src/shared/model-suggestion-retry.test.ts src/plugin/chat-message.test.ts
  • bun run build

Related Issues


Summary by cubic

Strengthens delegated runtime fallback and retries to prevent hangs, duplicate prompts, and dropped retries. Improves completion/visibility detection across mixed SDK payloads while ignoring provider auto‑retry noise.

  • Bug Fixes
    • Treat immediate provider retry/quota statuses as fallback triggers (no false progress); normalize object-shaped models into provider/model strings.
    • Normalize message responses (raw arrays, data, or messages) before checks; ignore provider auto‑retry text in visibility checks; count reasoning-only and token/cost-based completions as visible; ignore empty finish markers.
    • Queue sync delegated retries behind an active prompt reservation; enqueue runtime fallback prompts and cancel stale queued runtime-fallback: prompts when not accepted.
    • Dispatch fallback retry prompts as internal-only text parts (synthetic: true) to keep them hidden and avoid a second visible user prompt.
    • Build retry payloads from the last real user turn (ignore synthetic/internal retries); keep delegated child bootstrap prompt until a real user turn is persisted.
    • Sync poller: detect completion via top-level time.completed with meaningful payload; use timestamps when IDs are missing; keep polling on empty completions; improve terminal error extraction.
    • Error classification: recognize “Free usage exceeded, subscribe to Go” as quota-exceeded.

Written for commit b2ef800. Summary will update on new commits. Review in cubic

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 19 files

Confidence score: 3/5

  • There is a concrete regression risk in src/hooks/runtime-fallback/visible-assistant-response.ts: auto-retry signal messages can still hit the completion/progress branch and suppress fallback, even though retry signals are meant to be excluded.
  • The issue is high severity (8/10) with high confidence (10/10), so this is more than a minor edge case and could produce user-visible fallback behavior problems.
  • Given this specific, user-impacting logic gap, merge risk is moderate rather than blocking-critical, but it should be addressed or carefully validated before release.
  • Pay close attention to src/hooks/runtime-fallback/visible-assistant-response.ts - retry-signal exclusion can be bypassed by completion/progress handling and suppress fallback incorrectly.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread src/hooks/runtime-fallback/visible-assistant-response.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 2 files (changes from recent commits).

Requires human review: The PR makes extensive, logic-sensitive changes across fallback retries, message normalization, and session handling; despite the clean AI review, the custom '100% no regressions' threshold is not met for such a large and complex diff.

Re-trigger cubic

@code-yeongyu

Copy link
Copy Markdown
Owner

[sisyphus-bot] Hi andomeder. 🙏 Thanks for the very thorough work here. I read through every file in the 19-file diff, and I want to be transparent about why I'm not landing this without a maintainer pass even though Cubic's automated review was clean.

What I checked and liked

  • src/shared/prompt-async-gate.ts gets one new exported function (cancelQueuedInternalPrompts) plus a private sourceMatches helper. The change is purely additive: existing reservation, dispatch, and release behavior is untouched, and the new function only operates on already-queued entries by source/source-prefix. That's the right shape for adding cancellation without rewriting the gate semantics.
  • src/shared/internal-initiator-marker.ts introduces createInternalAgentTextPart and isSyntheticOrInternalTextPart, so internal retries can be tagged distinctly from synthetic assistant placeholders. That distinction lets visible-assistant-response.ts and session-status-handler.ts filter out retry signals when they're checking visibility, which addresses the "empty assistant completion rows could suppress fallback" failure mode in the PR description.
  • The new error-classifier.ts and auto-retry-signal.test.ts are clean factor-outs. Splitting "is this an immediate-quota error?" / "is this a retryable runtime error?" into a dedicated module makes the retry path much easier to reason about, and the table-style tests are exhaustive.
  • Test coverage is genuinely strong: auto-retry.test.ts, message-update-handler.test.ts (+116), session-status-handler.test.ts (+74), sync-session-poller.test.ts (+123), and the new test files all pin the failure scenarios described in the PR body. The "delegated child retries could lose their bootstrap prompt before a real user turn was persisted" case is locked in particular.
  • fallback-bootstrap-model.ts and last-user-retry-parts.ts are factored into dedicated modules instead of inlined; that helps the rest of the auto-retry surface stay small.

Why I want a maintainer eye before merge

  • src/shared/prompt-async-gate.ts is the most sensitive module in the repo (see the "Internal message injection is dangerous" section in AGENTS.md). Even an additive change there warrants a careful read against the gate invariants document, the existing route-audit test, and the duplicate-injection regression tests. I'd like code-yeongyu or another maintainer to confirm the new cancelQueuedInternalPrompts API can't be used to skip reservation cleanup paths in any caller down the line.
  • The auto-retry.ts change wires in the new error-classifier and the new internal-initiator-marker, and the new event-handler path. The integration surface across runtime-fallback/* is large enough that the maintainer's mental model of how the recovery loop actually behaves in production is the right tie-breaker.
  • Cubic's auto-approve explicitly declined here ("100% no regressions threshold is not met for such a large and complex diff"), which lines up with my read; the diff is correct in the small but the integration is the open question.

Concrete observations (mostly non-blocking, one worth confirming)

  • The new cancelQueuedInternalPrompts(sessionID, { sourcePrefix }) does not call back into any reservation release path. Callers need to make sure the prompts they cancel weren't already counted against an active reservation; otherwise the reservation could outlive the queued work. I see the auto-retry call site does walk the right release order, but the public API itself doesn't enforce this; a comment near the function declaration noting that constraint would help future callers.
  • isSyntheticOrInternalTextPart is now used in both visible-assistant-response.ts and session-status-handler.ts. Worth a sanity check that both call sites agree on which signals count as "internal" so the visibility heuristic doesn't drift between them later.
  • The PR is on top of a base that's now CONFLICTING against dev. A rebase will be needed before any final merge anyway, and that rebase is a natural chance to surface any of the integration concerns above.

Going to leave this for a maintainer to take the rest of the way. Thanks again for the careful structure; the additive shape and the test coverage made it possible to write this much detail without me having to guess at intent.

This was referenced May 21, 2026
PeterPonyu added a commit to PeterPonyu/oh-my-openagent that referenced this pull request May 22, 2026
…n internal markers

Design: the same OMO_INTERNAL_INITIATOR_MARKER token serves two
semantically distinct contexts that MUST stay separate.

VISIBLE markers (createInternalAgentTextPart, no synthetic flag):
PR code-yeongyu#4003 agent-handoff identity markers — the agent MUST read its new
identity from the chat stream. If these were synthetic, the LLM would
be blind to its own role change and fail to switch identities.

HIDDEN markers (createInternalAgentContinuationTextPart, synthetic: true):
PR code-yeongyu#4223 internal retry/compaction prompts — invisible to the LLM as
user input so they do not pollute the agent's perceived conversation.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@code-yeongyu code-yeongyu force-pushed the dev branch 2 times, most recently from eb25d29 to 2bfad49 Compare May 23, 2026 16:00
PeterPonyu added a commit to PeterPonyu/oh-my-openagent that referenced this pull request May 24, 2026
…n internal markers

Design: the same OMO_INTERNAL_INITIATOR_MARKER token serves two
semantically distinct contexts that MUST stay separate.

VISIBLE markers (createInternalAgentTextPart, no synthetic flag):
PR code-yeongyu#4003 agent-handoff identity markers — the agent MUST read its new
identity from the chat stream. If these were synthetic, the LLM would
be blind to its own role change and fail to switch identities.

HIDDEN markers (createInternalAgentContinuationTextPart, synthetic: true):
PR code-yeongyu#4223 internal retry/compaction prompts — invisible to the LLM as
user input so they do not pollute the agent's perceived conversation.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants