fix(providers,workflows): treat Claude SDK stop_sequence success as success (#1425) by Wirasm · Pull Request #1662 · coleam00/Archon

Wirasm · 2026-05-13T07:45:28Z

Summary

Problem: Claude Agent SDK's SDKResultSuccess declares is_error: boolean (not literal false). Stop_sequence terminations carry is_error: true + subtype: 'success' — the SDK's encoding of "non-default termination, but not a failure". The claude provider forwarded this to MessageChunk.isError verbatim, so three downstream consumers (dag-executor main, dag-executor loop, orchestrator-agent direct chat) misclassified clean terminations as node failures with the contradictory user-hostile error "Node '<X>' failed: SDK returned success" — even though the AI had completed correctly and written its output.
Why it matters: Three reporters across v0.3.9 and v0.3.11 (latest 2026-05-13). Review-classify and synthesis pipelines, plus loop-back wrapper workflows, hard-fail and skip downstream trigger-rule joins (synthesize, self-fix, simplify, report). Users blocked unless they fork+patch+rebuild; archon's workflow schema does not expose stop_sequences and disabling output_format would break downstream when: conditions.
What changed: Provider boundary normalises is_error: true + subtype: 'success' as a clean success — no isError propagation downstream; debug-level claude.result_success_stop_sequence log preserves observability. Defense-in-depth errorSubtype !== 'success' guards added to dag-executor.ts main path and loop branch so a future third-party IAgentProvider forwarding the SDK pair raw can't reintroduce the bug. Regression tests for provider, DAG main, DAG loop, and orchestrator direct chat; guard test ensures real error subtypes (error_max_turns) still propagate.
What did NOT change (scope boundary): No change to MessageChunk type shape in packages/providers/src/types.ts (contract semantics tighten, type signature unchanged). No change to Codex turn.failed path (Bug A in the issue is already resolved by commit 4c6ddd99 + the current Codex turn.failed handler at codex/provider.ts:235-247). No change to orchestrator-agent.ts code (the provider-level fix covers direct chat transparently). No retry/auth-precheck behavioural improvements. No touch on related but separate bugs fix: interactive loop resume crashes with error_during_execution (stale session) #1208, bug(codex): retry loop reuses caller AbortSignal; crash on attempt N poisons attempt N+1 #1266.

UX Journey

Before

User                          Archon                                Claude SDK
────                          ──────                                ──────────
runs workflow with        ──▶  classify node executes
output_format: json_schema     SDK returns {is_error: true,
(implies stop_sequence)        subtype: 'success',
                               stop_reason: 'stop_sequence'}  ◀──── completes via stop sequence
                               provider forwards isError verbatim
                               dag-executor.ts:921 throws:
                                 "Node 'classify' failed:
                                  SDK returned success"
                               workflow exits with failure
                               trigger-rule joins skip
                               (synthesize/self-fix/simplify/report)
sees ❌ on a run whose  ◀──── output was written to disk;
classify output is valid       review pipeline produces nothing

After

User                          Archon                                Claude SDK
────                          ──────                                ──────────
runs workflow with        ──▶  classify node executes
output_format: json_schema     SDK returns {is_error: true,
                               subtype: 'success',
                               stop_reason: 'stop_sequence'}  ◀──── completes via stop sequence
                               [+] provider normalises:
                                   is_error + 'success' subtype
                                   ⇒ clean result, no isError
                                   [+] debug log: claude.result_success_stop_sequence
                               dag-executor.ts:921 guard:
                                   `isError && errorSubtype !== 'success'` ⇒ skip throw
                               node completes normally
                               trigger-rule joins fire
                               downstream nodes run on real output
sees ✅ classify output      ◀── review pipeline writes findings

Architecture Diagram

Before

@archon/providers
└── claude/provider.ts
    └── result handler: forwards SDK is_error verbatim
        ──▶ MessageChunk { isError: true, errorSubtype: 'success' }
                                            │
                                            ▼
@archon/workflows                @archon/core
├── dag-executor.ts              orchestrator-agent.ts
│   ├── main path:921             ├── handleStreamMode:1142
│   │   throws unconditionally    │   sends classifyAndFormatError("success")
│   │   on isError                │
│   └── loop branch:1947          └── handleBatchMode:1298
│       same throw                    same surface

After

@archon/providers
└── claude/provider.ts
    [~] result handler: normalises is_error + subtype: 'success'
        ──▶ MessageChunk { /* no isError when subtype === 'success' */ }
            [+] debug log: claude.result_success_stop_sequence
                                            │
                                            ▼
@archon/workflows                @archon/core
├── dag-executor.ts              orchestrator-agent.ts
│   ├── main path:921             ├── handleStreamMode:1142
│   │   [~] && errorSubtype       │   transparently fixed (chunk clean)
│   │       !== 'success'         │
│   └── loop branch:1947          └── handleBatchMode:1298
│       [~] same carve-out            transparently fixed (chunk clean)

Connection inventory:

From	To	Status	Notes
`claude/provider.ts` result handler	`MessageChunk.isError`	modified	No longer emits `isError: true` when `subtype === 'success'`
`claude/provider.ts` result handler	logger	modified	New debug-level `claude.result_success_stop_sequence` event
`dag-executor.ts:907-922` (main)	`throw new Error(...)`	modified	New `errorSubtype !== 'success'` carve-out
`dag-executor.ts:1932-1948` (loop)	`throw new Error(...)`	modified	Same carve-out mirrored
`orchestrator-agent.ts:1127, 1298`	`MessageChunk`	unchanged	Transparently fixed via provider normalisation
Existing `error_max_budget_usd` carve-out at `dag-executor.ts:894-903`	—	unchanged	Pattern mirrored for new carve-out

Label Snapshot

Risk: risk: low
Size: size: S
Scope: workflows, core (tests)
Module: providers:claude, workflows:executor, core:orchestrator (test)

Change Metadata

Change type: bug
Primary scope: multi (providers + workflows + core tests)

Linked Issue

Closes bug(workflow): two silent-failure paths in DAG executor — Codex turn.failed exits stream, Claude success-with-stop-sequence thrown as failure #1425
Related: fix: interactive loop resume crashes with error_during_execution (stale session) #1208 (parent fix this regression rode on top of), bug(codex): retry loop reuses caller AbortSignal; crash on attempt N poisons attempt N+1 #1266 (separate Codex AbortSignal bug — not addressed), feat(workflows): add execution-evidence contract with terminal-success gate #1601 (broader evidence-contract PR — orthogonal)
Depends on: none
Supersedes: none

Validation Evidence (required)

bun run validate   # ✅ exit 0

All six gates green on first invocation (after one prettier auto-format pass on the new provider test):

check:bundled — bundled-defaults.generated.ts is up to date (36 commands, 20 workflows)
check:bundled-skill — bundled-skill.ts is up to date (21 files)
type-check — all 10 packages exit 0
lint — 0 errors, 0 warnings
format:check — all files Prettier-clean
test — all per-package suites pass; new tests:
- @archon/providers claude/provider.test.ts: +1 test (success-via-stop-sequence) → 26 → 27 in sendQuery decomposition behaviors
- @archon/workflows dag-executor.test.ts: +2 tests (DAG main + loop branch) → 218 in the test file
- @archon/core orchestrator-agent.test.ts: +1 test (direct chat contract) → 116 → 117

Skipped commands: none.

Security Impact (required)

New permissions/capabilities? No — only changes the conditional emission of an existing field on an existing chunk type, and a log level for one event.
New external network calls? No.
Secrets/tokens handling changed? No.
File system access scope changed? No.
Risk and mitigation: n/a.

Compatibility / Migration

Backward compatible? Yes — workflows that previously failed with "SDK returned success" will now complete cleanly. No workflow YAML changes required. Workflows that produced real errors (e.g. error_max_turns, error_during_execution, error_max_budget_usd) continue to fail loudly; the guard is subtype-scoped to 'success' only.
Config/env changes? No.
Database migration needed? No.
Upgrade steps: none.

Human Verification (required)

Verified scenarios:
- bun run validate exits 0 on a clean checkout of this branch (after one prettier auto-format on the new test).
- Individual per-package test runs: provider 74 pass, workflows dag-executor 218 pass, core orchestrator-agent 117 pass.
- SDK type signature personally inspected at @anthropic-ai/claude-agent-sdk@0.2.121/sdk.d.ts:3117-3158 — SDKResultSuccess.is_error: boolean (not literal false), confirming the bug mechanism described in the investigation artifact.
- Provider's else if (resultMsg.is_error === true && resultMsg.subtype === 'success') branch verified to fire on the new test case (asserted via not.toHaveBeenCalledWith(..., 'claude.result_is_error')).
Edge cases checked:
- Real error subtype (error_max_turns) still propagates isError: true and fires the claude.result_is_error error log — guarded by the existing test at provider.test.ts:1223 which continues to pass.
- Codex provider's turn.failed path (codex/provider.ts:235-247) audited — emits errorSubtype: 'codex_turn_failed', never 'success'. Unaffected.
- Pi provider's event-bridge.ts:160,164,173 audited — emits 'missing_assistant_message' or last.stopReason ∈ {'error','aborted'}, never 'success'. Unaffected.
- Empty-output guard at dag-executor.ts:1126 audited — still triggers if any node produces zero output AND no structured output. Defence-in-depth for Bug A (issue bug(workflow): two silent-failure paths in DAG executor — Codex turn.failed exits stream, Claude success-with-stop-sequence thrown as failure #1425) remains intact.
What was not verified:
- End-to-end live run against archon-fix-github-issue on a real Claude output_format: json_schema node. The unit tests synthesise the exact (is_error, subtype, stop_reason) triple the SDK emits in this case, so behavioural coverage is equivalent.
- Behaviour on Claude Agent SDK versions older than 0.2.121. The SDKResultSuccess.is_error: boolean type signature has been stable across the 0.2.x line (verified in 0.2.74 and 0.2.89 lockfile entries).

Side Effects / Blast Radius (required)

Affected subsystems:
- @archon/providers — Claude provider only. Codex/Pi unaffected.
- @archon/workflows — DAG executor's main and loop branches only.
- @archon/core — orchestrator direct chat transparently fixed; no code change to orchestrator-agent.ts.
Potential unintended effects:
- Stop_sequence-terminated runs that previously failed loudly with "SDK returned success" will now complete. This is the intended behaviour, but operators who monitored those failures as alerts will see them disappear. Logs preserve a debug-level claude.result_success_stop_sequence event for ops dashboards.
- The claude.result_is_error error-level log no longer fires for subtype === 'success'. The new debug event is a clean replacement.
Guardrails / monitoring for early detection:
- claude.result_success_stop_sequence log (debug) shows where the new code path fires.
- claude.result_is_error log (error) continues to fire for genuine error subtypes — alert on this if a real regression slips through.
- Type-check + lint + format catch any author misuse of the guard pattern at change time.

Rollback Plan (required)

Fast rollback path: git revert <merge-commit-sha> && git push origin dev — single commit, additive only. Reverting restores the pre-PR throw behaviour on stop_sequence terminations (re-introducing bug(workflow): two silent-failure paths in DAG executor — Codex turn.failed exits stream, Claude success-with-stop-sequence thrown as failure #1425's bug for affected users).
Feature flags / config toggles: None. The fix is unconditional and non-configurable — there's no scenario in which surfacing subtype === 'success' as a failure is correct.
Observable failure symptoms post-merge:
- If a user reports a workflow that completes when it should fail: check claude.result_is_error logs — should fire for real errors.
- If a user reports the original bug(workflow): two silent-failure paths in DAG executor — Codex turn.failed exits stream, Claude success-with-stop-sequence thrown as failure #1425 symptom returning: check the new && msg.errorSubtype !== 'success' guards in dag-executor.ts lines ~915 and ~1940 are still present; check claude/provider.ts:807-826 isRealError derivation is intact.

Risks and Mitigations

Risk: The provider's success normalisation hides a future SDK regression where subtype: 'success' actually does indicate a failure.
- Mitigation: Very unlikely given the SDK's explicit discriminated union (SDKResultMessage = SDKResultSuccess | SDKResultError) and the error_* subtype enum. If the SDK ever changes semantics, the executor's defense-in-depth guard would still emit a less-confusing error (the node would complete rather than throw with a contradictory message). The guard scope is subtype-equality on the string literal 'success', not a broader pattern.
Risk: A third-party IAgentProvider implementation copies the SDK pattern and starts emitting isError: true, errorSubtype: 'success'.
- Mitigation: Defense-in-depth guards at dag-executor.ts:915 and :1940 already cover this — the executor will not throw on such a chunk regardless of which provider emitted it.

Investigation artifact: .claude/PRPs/issues/issue-1425.md (local-only, gitignored per repo policy)

Summary by CodeRabbit

Bug Fixes
- Claude stop-sequence terminations that indicate clean completion are no longer misclassified as errors; sessions persist and no false error messages are shown.
Tests
- Added regression tests across provider, orchestrator, and workflow layers to prevent reintroduction of the misclassification and ensure workflows complete correctly.

…uccess (#1425) The Claude Agent SDK's SDKResultSuccess type declares is_error as boolean (not literal false). When a model terminates via a configured stop sequence the SDK sets is_error: true while keeping subtype: 'success' — its encoding of "non-default termination, but not a failure". The claude provider forwarded is_error verbatim into MessageChunk.isError, so three downstream consumers (dag-executor main path, dag-executor loop branch, orchestrator-agent direct chat) misclassified clean stop_sequence terminations as node failures and produced the contradictory user-hostile error "Node '<X>' failed: SDK returned success" — even though the AI had already completed correctly and written its output. Multi-user impact on v0.3.9 / v0.3.11 across review-classify and synthesis pipelines. Changes: - claude/provider.ts: normalise is_error + subtype: 'success' as a clean result. Don't propagate isError downstream; log a debug-level claude.result_success_stop_sequence event for observability instead. - dag-executor.ts (main path + loop branch): defense-in-depth guard so third-party IAgentProvider implementations that forward the SDK pair raw can't reintroduce the same false-failure. - provider/dag-executor/orchestrator-agent tests: regression tests covering the stop_sequence success path; guard test ensures real error subtypes (error_max_turns) still propagate. Fixes #1425

Wirasm · 2026-05-13T07:46:30Z

🔍 Automated Code Review

Independent review by prp-core:code-reviewer (didn't see the implementation conversation).

Result: PASS — Ready for merge

Findings

✅ Strengths

Provider boundary is the right layer. is_error: true + subtype: 'success' is translated to a clean MessageChunk before it leaves the provider. SDK-specific quirks belong here, not at three downstream consumers.
Tight predicate. isRealError = is_error === true && subtype !== 'success' is exact string-equality, not a loose pattern. A future subtype: 'success_partial' or any other value still routes to the error path.
Defense-in-depth guards are consistent. Both dag-executor.ts carve-outs (~915 main, ~1940 loop) use the same msg.isError && msg.errorSubtype !== 'success' predicate and sit correctly after the existing error_max_budget_usd carve-out at ~894.
Tests exercise all three layers. The orchestrator test is not vacuous — it runs the full handleMessage path including classifyAndFormatError, so it validates the end-to-end non-regression for direct chat.

⚠️ Suggestions (non-blocking)

One reasonable missing scenario: a third-party provider that emits isError: true, errorSubtype: 'success' with content in msg.errors would have those errors silently swallowed by the executor guard. This is an intentional design choice (acknowledged by the comment), and such a chunk is contractually malformed per the SDK's union — but worth keeping in mind if a future provider drifts.

🔒 Security

No security concerns identified. No new permissions, network calls, secrets handling, or filesystem scope. The change only modifies conditional emission of an existing chunk field and downgrades one log level.

Checklist

Fix addresses root cause from investigation
Code follows codebase patterns (mirrors error_max_budget_usd carve-out)
Tests cover the change at all three call sites + a guard test for real errors
CLAUDE.md compliance — typed imports, no any, Pino {domain}.{action}_{state} event naming
bun run validate exit 0 (six gates: check:bundled, check:bundled-skill, type-check, lint, format:check, test)

Self-reviewed by Claude • Ready for human review

Wirasm · 2026-05-13T07:49:10Z

PR Review Summary (multi-agent)

Five specialist agents ran against the diff: code-reviewer, docs-impact, pr-test-analyzer, comment-analyzer, silent-failure-hunter. Aggregated findings below.

Critical Issues

None.

Important Issues

Agent	Issue	Location
silent-failure-hunter	`orchestrator-agent.ts` was not updated with the same `errorSubtype !== 'success'` carve-out that was added to `dag-executor.ts`. PR description acknowledges this as deliberate ("transparently fixed via provider normalisation") since `ClaudeProvider` now strips `isError` before yielding. The gap matters only if a third-party `IAgentProvider` forwards the raw SDK pair — same scenario the dag-executor guards already defend against. Inconsistent defense-in-depth posture between the two consumers.	`packages/core/src/orchestrator/orchestrator-agent.ts:1127`, `:1298`
docs-impact	Missing `CHANGELOG.md` entry under `[Unreleased]` → `### Fixed`. User-observable bug fix.	`CHANGELOG.md`
comment-analyzer	Three `(#1425)` issue-number references in production comments. Per project CLAUDE.md, issue refs in inline comments rot once closed. (Pre-existing `(#1208)` refs compound this.)	`packages/providers/src/claude/provider.ts:805`, `packages/workflows/src/dag-executor.ts:911`, `:1939`
comment-analyzer	Loop guard comment leans on "see the corresponding guard in the main node path above" — the main path is ~1025 lines away. Should restate the SDK invariant inline like the main-path comment does.	`packages/workflows/src/dag-executor.ts:1938`

Suggestions

Agent	Suggestion	Location
pr-test-analyzer	Optional regression test (criticality 5/10): loop with `fresh_context: true`, `max_iterations: 2`, second iteration yields `is_error: true, subtype: 'success'`. Same guard code path, so completeness-only.	`packages/workflows/src/dag-executor.test.ts`

Strengths

Provider normalization is tight — isRealError = is_error === true && subtype !== 'success' is exact-equality, no loose match. Real error subtypes (error_max_turns, error_during_execution, error_max_budget_usd) continue to propagate correctly.
dag-executor guards are exact — both use errorSubtype !== 'success' string-equality.
Comment at provider normalization site is exemplary: explains the SDK type defect (is_error: boolean, not literal false), the trigger, the SDK's encoding convention, and the downstream consequence. Exactly the "why" comment the codebase should aspire to.
Behavioral test coverage — all four new tests assert on observable side-effects (no node_failed events, completeWorkflowRun called, no error message sent to user, claude.result_is_error NOT logged), not just "no throw".
Empty-output guard at dag-executor.ts:1131 still fires for stop_sequence-success nodes that produce no content — correct fail-loudly behavior preserved.
Log level is appropriate — debug for claude.result_success_stop_sequence, not info.

Verdict

NEEDS MINOR FIXES — no blockers. Recommend addressing the two Important items before merge: (1) the orchestrator-agent defense-in-depth gap (or explicitly document why it was scoped out) and (2) the missing CHANGELOG entry. The (#1425) comment refs and loop comment cross-reference are quality nits worth fixing while touching the file but not merge-blocking.

Recommended Actions

Apply the same && msg.errorSubtype !== 'success' carve-out at orchestrator-agent.ts:1127 and :1298 for consistency with the dag-executor guards, OR keep current behavior and add a brief comment explaining why the orchestrator deliberately relies on provider-level normalization alone.
Add a ### Fixed entry under [Unreleased] in CHANGELOG.md.
Drop the three (#1425) issue refs from production comments.
Make the loop guard comment self-contained.

coderabbitai · 2026-05-13T07:52:36Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e9714dd7-fee0-4d77-8580-7ef870212340

📥 Commits

Reviewing files that changed from the base of the PR and between 2af935c and b8af40f.

📒 Files selected for processing (6)

CHANGELOG.md
packages/core/src/orchestrator/orchestrator-agent.test.ts
packages/core/src/orchestrator/orchestrator-agent.ts
packages/providers/src/claude/provider.test.ts
packages/providers/src/claude/provider.ts
packages/workflows/src/dag-executor.ts

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

🚧 Files skipped from review as they are similar to previous changes (1)

packages/providers/src/claude/provider.ts

📝 Walkthrough

Walkthrough

Fixes Claude SDK result misclassification where SDK emits is_error:true with subtype:'success' and stop_reason:'stop_sequence', normalizing it to a non-error at the provider boundary and adding executor and orchestrator guards plus regression tests and a changelog entry.

Changes

SDK Success-Marker Classification and Propagation

Layer / File(s)	Summary
Claude Provider Error Classification `packages/providers/src/claude/provider.ts`, `packages/providers/src/claude/provider.test.ts`	Provider's `streamClaudeMessages` computes `isRealError` = `is_error === true` AND `subtype !== 'success'`, debug logs the clean stop_sequence case, and emits result chunks omitting `isError`/`errorSubtype`/`errors` when `isRealError` is false. Test validates shape and logging.
DAG Executor Node and Loop Error Handling `packages/workflows/src/dag-executor.ts`, `packages/workflows/src/dag-executor.test.ts`	`executeNodeInternal` and `executeLoopNode` only treat `msg.isError` as failure when `msg.errorSubtype !== 'success'`. Regression tests ensure no `node_failed` or `loop_iteration_failed` events and that workflow completion proceeds.
Orchestrator End-to-End Behavior `packages/core/src/orchestrator/orchestrator-agent.ts`, `packages/core/src/orchestrator/orchestrator-agent.test.ts`, `CHANGELOG.md`	Orchestrator-agent narrows error handling to `msg.isError && msg.errorSubtype !== 'success'`. Adds two tests confirming no user-visible error and session ID persistence. CHANGELOG updated with a fix note.

Sequence Diagram

sequenceDiagram
  participant Client
  participant Orchestrator
  participant Provider
  participant ClaudeSDK
  Client->>Orchestrator: request workflow execution
  Orchestrator->>Provider: sendQuery
  Provider->>ClaudeSDK: stream request
  ClaudeSDK-->>Provider: result is_error:true subtype:success stop_reason:stop_sequence
  Provider->>Provider: compute isRealError
  Provider-->>Orchestrator: emit clean result stopReason:stop_sequence no isError
  Orchestrator->>Client: report completion no user-visible error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

bug(workflow): two silent-failure paths in DAG executor — Codex turn.failed exits stream, Claude success-with-stop-sequence thrown as failure #1425: Addresses the Claude SDK false-failure behavior where is_error === true with errorSubtype: 'success' and stop_reason: 'stop_sequence' should be treated as a clean termination.

Possibly related PRs

coleam00/Archon#1291: Related changes to DAG executor and provider classification for Claude SDK result handling.
coleam00/Archon#1089: Also touches orchestrator handling of msg.isError result chunks; may overlap in orchestrator behavior changes.
coleam00/Archon#1162: Related provider adjustments around Claude result event normalization.

Poem

🐰 I hopped through logs and tests today,
Claude said stop but feared dismay,
is_error true yet subtype sings success,
Provider and executor now handle the rest.
Sessions stay steady and users rest.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix(providers,workflows): treat Claude SDK stop_sequence success as success (`#1425`)' is fully related to the main changeset, clearly summarizing the primary fix of preventing stop-sequence terminations from being misclassified as failures.
Description check	✅ Passed	The PR description is comprehensive, covering all required template sections including summary, UX journey, architecture diagrams, validation evidence, security impact, compatibility, human verification, side effects, and rollback plan. All required content is present and detailed.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/1425-claude-success-stop-sequence

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/providers/src/claude/provider.test.ts (1)

1249-1281: ⚡ Quick win

Assert the success-marker debug log in this regression test.

You already verify no error log; adding a mockLogger.debug assertion for the success-marker path will lock in the observability contract too.

Proposed test assertion

     expect(chunks[0]).not.toHaveProperty('isError');
     expect(chunks[0]).not.toHaveProperty('errorSubtype');
     expect(chunks[0]).not.toHaveProperty('errors');
     expect(mockLogger.error).not.toHaveBeenCalledWith(expect.anything(), 'claude.result_is_error');
+    expect(mockLogger.debug).toHaveBeenCalledWith(
+      expect.objectContaining({
+        sessionId: 'sid-stop-seq',
+        stopReason: 'stop_sequence',
+      }),
+      'claude.result_success_stop_sequence'
+    );
   });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/providers/src/claude/provider.test.ts` around lines 1249 - 1281, Add
an assertion that the success-marker debug log was emitted in this regression
test: after collecting chunks and asserting no error logs, assert that
mockLogger.debug was called with the success marker key (e.g.
expect(mockLogger.debug).toHaveBeenCalledWith(expect.anything(),
'claude.result_success_marker')); place this check alongside the existing
mockLogger.error assertion to lock in the observability contract for the
stop_sequence success path.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/providers/src/claude/provider.ts`:
- Around line 817-824: The log event name 'claude.result_success_stop_sequence'
should follow the `{domain}.{action}_{state}` convention with an allowed state
suffix; update the getLog().debug call in the block handling resultMsg (where
resultMsg.is_error === true && resultMsg.subtype === 'success') to use a
compliant state such as 'claude.result_success_validated' (retain the same
structured payload: sessionId and stopReason) so the event name aligns with the
pino logging standard; leave the surrounding logic in provider.ts and the
resultMsg checks unchanged.

---

Nitpick comments:
In `@packages/providers/src/claude/provider.test.ts`:
- Around line 1249-1281: Add an assertion that the success-marker debug log was
emitted in this regression test: after collecting chunks and asserting no error
logs, assert that mockLogger.debug was called with the success marker key (e.g.
expect(mockLogger.debug).toHaveBeenCalledWith(expect.anything(),
'claude.result_success_marker')); place this check alongside the existing
mockLogger.error assertion to lock in the observability contract for the
stop_sequence success path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6e3a1284-23a4-487e-9a4e-08601fdbf509

📥 Commits

Reviewing files that changed from the base of the PR and between d9d2bc6 and 2af935c.

📒 Files selected for processing (5)

packages/core/src/orchestrator/orchestrator-agent.test.ts
packages/providers/src/claude/provider.test.ts
packages/providers/src/claude/provider.ts
packages/workflows/src/dag-executor.test.ts
packages/workflows/src/dag-executor.ts

- Apply errorSubtype !== 'success' guard at orchestrator-agent.ts handleStreamMode and handleBatchMode result branches (defense-in-depth, mirrors dag-executor). Without this, a third-party IAgentProvider that forwards the SDK pair raw would surface a spurious error on direct chat and drop conversation output. Adds matching regression test. - Rename log event claude.result_success_stop_sequence -> claude.result_success_validated per CodeRabbit; aligns with {domain}.{action}_{state} pino convention. - Lock the new debug log into the provider regression test. - Drop (#1425) issue refs from production comments (rot risk per CLAUDE.md). - Rewrite loop-branch guard comment to be self-contained instead of cross- referencing the main-path guard 1000+ lines away. - Add CHANGELOG entry under [Unreleased] -> Fixed.

Wirasm · 2026-05-13T10:45:16Z

Review feedback addressed (commit `b8af40f`)

Addressed all Important items from the multi-agent review and the actionable CodeRabbit findings.

Fixed

Source	Item	Resolution
silent-failure-hunter	`orchestrator-agent.ts:1127` and `:1298` defense-in-depth gap	Applied the `&& msg.errorSubtype !== 'success'` carve-out at both sites with a comment explaining the third-party-provider scenario. Added a matching regression test that yields the raw SDK pair (no provider normalisation) and asserts the orchestrator does not surface an error to the user.
CodeRabbit (Major)	Log event `claude.result_success_stop_sequence` doesn't follow `{domain}.{action}_{state}`	Renamed to `claude.result_success_validated`.
CodeRabbit (Nitpick)	Provider regression test doesn't lock in the debug log	Added `expect(mockLogger.debug).toHaveBeenCalledWith(...)` assertion alongside the existing `mockLogger.error` check.
comment-analyzer	Three `(#1425)` issue refs in production comments	Removed from `provider.ts:805`, `dag-executor.ts:911`, `dag-executor.ts:1939`.
comment-analyzer	Loop guard comment cross-references main-path guard 1000+ lines away	Rewrote comment to restate the SDK invariant inline.
docs-impact	Missing CHANGELOG entry	Added `### Fixed` entry under `[Unreleased]` describing the user-observable behaviour change.

Skipped (with rationale)

pr-test-analyzer's optional fresh_context: true multi-iteration loop test (rated 5/10): same guard code path as the existing loop test, so completeness-only. The agent itself flagged it as "low-risk in practice" and not a likely regression vector.
Pre-existing (#1208) refs in the same comment blocks: out of scope for this PR — touching them would balloon the diff for unrelated cleanup.

Validation

bun run validate   # ✅ exit 0

All six gates green:

check:bundled / check:bundled-skill
type-check (10 packages)
lint (0 errors, 0 warnings)
format:check
test — including the two new regression tests:
- @archon/providers claude/provider.test.ts: debug log assertion
- @archon/core orchestrator-agent.test.ts: defense-in-depth (raw SDK pair)

…uccess (coleam00#1425) (coleam00#1662) * fix(providers,workflows): treat Claude SDK stop_sequence success as success (coleam00#1425) The Claude Agent SDK's SDKResultSuccess type declares is_error as boolean (not literal false). When a model terminates via a configured stop sequence the SDK sets is_error: true while keeping subtype: 'success' — its encoding of "non-default termination, but not a failure". The claude provider forwarded is_error verbatim into MessageChunk.isError, so three downstream consumers (dag-executor main path, dag-executor loop branch, orchestrator-agent direct chat) misclassified clean stop_sequence terminations as node failures and produced the contradictory user-hostile error "Node '<X>' failed: SDK returned success" — even though the AI had already completed correctly and written its output. Multi-user impact on v0.3.9 / v0.3.11 across review-classify and synthesis pipelines. Changes: - claude/provider.ts: normalise is_error + subtype: 'success' as a clean result. Don't propagate isError downstream; log a debug-level claude.result_success_stop_sequence event for observability instead. - dag-executor.ts (main path + loop branch): defense-in-depth guard so third-party IAgentProvider implementations that forward the SDK pair raw can't reintroduce the same false-failure. - provider/dag-executor/orchestrator-agent tests: regression tests covering the stop_sequence success path; guard test ensures real error subtypes (error_max_turns) still propagate. Fixes coleam00#1425 * fix(orchestrator,providers,workflows): address review feedback for coleam00#1425 - Apply errorSubtype !== 'success' guard at orchestrator-agent.ts handleStreamMode and handleBatchMode result branches (defense-in-depth, mirrors dag-executor). Without this, a third-party IAgentProvider that forwards the SDK pair raw would surface a spurious error on direct chat and drop conversation output. Adds matching regression test. - Rename log event claude.result_success_stop_sequence -> claude.result_success_validated per CodeRabbit; aligns with {domain}.{action}_{state} pino convention. - Lock the new debug log into the provider regression test. - Drop (coleam00#1425) issue refs from production comments (rot risk per CLAUDE.md). - Rewrite loop-branch guard comment to be self-contained instead of cross- referencing the main-path guard 1000+ lines away. - Add CHANGELOG entry under [Unreleased] -> Fixed.

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

Comment thread packages/providers/src/claude/provider.ts

Wirasm merged commit 936134f into dev May 13, 2026
4 checks passed

Wirasm deleted the fix/1425-claude-success-stop-sequence branch May 13, 2026 11:10

coleam00 mentioned this pull request May 14, 2026

Release 0.3.12 #1680

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(providers,workflows): treat Claude SDK stop_sequence success as success (#1425)#1662

fix(providers,workflows): treat Claude SDK stop_sequence success as success (#1425)#1662
Wirasm merged 2 commits into
devfrom
fix/1425-claude-success-stop-sequence

Wirasm commented May 13, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

Wirasm commented May 13, 2026

Uh oh!

Wirasm commented May 13, 2026

Uh oh!

coderabbitai Bot commented May 13, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Wirasm commented May 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Wirasm commented May 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

UX Journey

Before

After

Architecture Diagram

Before

After

Label Snapshot

Change Metadata

Linked Issue

Validation Evidence (required)

Security Impact (required)

Compatibility / Migration

Human Verification (required)

Side Effects / Blast Radius (required)

Rollback Plan (required)

Risks and Mitigations

Summary by CodeRabbit

Uh oh!

Wirasm commented May 13, 2026

🔍 Automated Code Review

Result: PASS — Ready for merge

Findings

✅ Strengths

⚠️ Suggestions (non-blocking)

🔒 Security

Checklist

Uh oh!

Wirasm commented May 13, 2026

PR Review Summary (multi-agent)

Critical Issues

Important Issues

Suggestions

Strengths

Verdict

Recommended Actions

Uh oh!

coderabbitai Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Wirasm commented May 13, 2026

Review feedback addressed (commit b8af40f)

Fixed

Skipped (with rationale)

Validation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Wirasm commented May 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 13, 2026 •

edited

Loading

Review feedback addressed (commit `b8af40f`)