Skip to content

fix(orchestrator): parse split workflow invocations#1542

Open
guanghuang wants to merge 1 commit into
coleam00:devfrom
guanghuang:fix/pi-split-invoke-workflow-dispatch
Open

fix(orchestrator): parse split workflow invocations#1542
guanghuang wants to merge 1 commit into
coleam00:devfrom
guanghuang:fix/pi-split-invoke-workflow-dispatch

Conversation

@guanghuang

@guanghuang guanghuang commented May 2, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: Some providers can emit /invoke-workflow across multiple assistant chunks, so Archon detects the command prefix before the full command body has arrived.
  • Why it matters: The web orchestrator can acknowledge a workflow dispatch in chat while never actually running the workflow.
  • What changed: The orchestrator now keeps collecting command chunks after detection, suppresses command text from user-visible output, preserves the latest parseable command snapshot, and parses raw batch output instead of formatted prose.
  • What did not change (scope boundary): This does not change workflow routing semantics, workflow execution, provider APIs, command syntax, or any Web UI behavior outside orchestrator command parsing.

UX Journey

Before

User                   Archon Web Chat                AI Provider
────                   ───────────────                ───────────
sends request ──────▶  sends prompt to provider ────▶ streams response
                       detects "/invoke-workflow"
                       stops collecting command body
sees assistant text ◀  may render acknowledgement
                       workflow dispatch parse fails or is incomplete
                       workflow does not run

After

User                   Archon Web Chat                      AI Provider
────                   ───────────────                      ───────────
sends request ──────▶  sends prompt to provider ─────────▶ streams response
                       detects "/invoke-workflow"
                       [continues collecting command chunks]
                       [suppresses command text from chat output]
                       [parses latest valid command snapshot]
sees dispatch state ◀  invokes requested workflow

Architecture Diagram

Before

packages/core/src/orchestrator/orchestrator-agent.ts
  ├─ handleStreamMode()
  │   └─ command prefix detection stopped useful accumulation too early
  └─ handleBatchMode()
      └─ parsed formatted assistant output, including non-command text

packages/core/src/orchestrator/orchestrator.test.ts
  └─ covered normal command dispatch, but not split command bodies

After

[~] packages/core/src/orchestrator/orchestrator-agent.ts
  ├─ handleStreamMode()
  │   === accumulates command response after prefix detection
  │   === tracks latest parseable command snapshot
  │   === suppresses command chunks from UI text
  └─ handleBatchMode()
      === parses raw concatenated assistant content for commands

[~] packages/core/src/orchestrator/orchestrator.test.ts
  └─ adds regression coverage for split stream and batch command dispatch

Connection inventory:

From To Status Notes
AI provider stream chunks handleStreamMode() modified Command chunks continue accumulating after prefix detection.
AI provider batch output handleBatchMode() modified Raw assistant text is used for command parsing.
Orchestrator command parser Workflow dispatch path unchanged Existing command parser and dispatch contract are preserved.
Orchestrator tests Split command behavior new Adds regression tests for stream and batch modes.

Label Snapshot

  • Risk: risk: low
  • Size: size: S
  • Scope: core|tests
  • Module: core:orchestrator

Change Metadata

  • Change type: bug
  • Primary scope: core

Linked Issue

Validation Evidence (required)

Commands and result summary:

bun test packages/core/src/orchestrator/orchestrator.test.ts
# Result: 70 pass, 0 fail

bun test packages/core/src/orchestrator/orchestrator-agent.test.ts
# Result: 100 pass, 0 fail
  • Evidence provided (test/log/trace/screenshot): Targeted unit test results for orchestrator dispatch and agent behavior.
  • If any command is intentionally skipped, explain why: Full validation was not run for this narrow fix. Running both orchestrator test files in one Bun invocation exposed pre-existing mock-state leakage in unrelated logger assertion tests; both files pass independently.

Security Impact (required)

  • New permissions/capabilities? (No)
  • New external network calls? (No)
  • Secrets/tokens handling changed? (No)
  • File system access scope changed? (No)
  • If any Yes, describe risk and mitigation: Not applicable.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Database migration needed? (No)
  • If yes, exact upgrade steps: Not applicable.

Human Verification (required)

What was personally validated beyond CI:

  • Verified scenarios: Split /invoke-workflow command chunks are parsed in stream mode; split command text is parsed in batch mode; trailing non-command text does not corrupt the preserved command snapshot.
  • Edge cases checked: Command detection before complete command body, command text suppression from user-visible output, batch parsing from raw assistant text.
  • What was not verified: Browser-level manual workflow launch through the Web UI was not re-run for this PR body update.

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: Core orchestrator command dispatch for /invoke-workflow and /register-project detection paths.
  • Potential unintended effects: Assistant messages containing command-looking text could be suppressed once command dispatch is detected.
  • Guardrails/monitoring for early detection: Existing command dispatch tests plus new split-command regressions should catch parsing regressions.

Rollback Plan (required)

  • Fast rollback command/path: Revert this PR commit.
  • Feature flags or config toggles (if any): None.
  • Observable failure symptoms: Workflow dispatch no longer starts from AI-generated /invoke-workflow output, or command text appears unexpectedly in chat output.

Risks and Mitigations

  • Risk: Providers may emit additional natural language after a command.
    • Mitigation: The implementation preserves the latest parseable command snapshot so trailing prose does not corrupt dispatch.
  • Risk: Command chunks could leak into chat output.
    • Mitigation: Command chunks are suppressed once command detection begins.

Summary by CodeRabbit

  • Tests
    • Added test coverage for stream and batch mode scenarios to ensure workflow isolation validation executes correctly while preventing unintended message propagation during command processing.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 2, 2026

Copy link
Copy Markdown

Caution

Review failed

An error occurred during the review process. Please try again later.

📝 Walkthrough

Walkthrough

This PR adds test coverage for stream-mode and batch-mode orchestration. When /invoke-workflow commands arrive across multiple assistant chunks, the tests verify that isolation validation runs while the command text is suppressed from the UI.

Changes

Workflow Command Split-Chunk Handling Tests

Layer / File(s) Summary
Stream and batch mode isolation validation for split /invoke-workflow commands
packages/core/src/orchestrator/orchestrator.test.ts
Stream-mode and batch-mode tests verify that when /invoke-workflow is detected before the command body arrives in subsequent chunks, mockValidateAndResolveIsolation is called and platform.sendMessage never receives messages containing /invoke-workflow.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • coleam00/Archon#1581: Both PRs add/adjust orchestrator /invoke-workflow multi-chunk stream/batch tests to ensure the UI/platform output containing /invoke-workflow is suppressed while isolation validation still runs.

Poem

🐰 A rabbit hops through chunks so small,
Where workflows split across the call,
No /invoke-workflow leaks to sight,
But isolation runs just right! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(orchestrator): parse split workflow invocations' accurately summarizes the main change: handling orchestrator commands split across multiple streamed chunks.
Linked Issues check ✅ Passed The PR successfully addresses all requirements from issue #1541: accumulates assistant text after command detection, suppresses command text from UI, preserves command snapshot to prevent corruption, uses raw text for batch parsing, and adds regression tests for split chunks in both modes.
Out of Scope Changes check ✅ Passed All changes are scoped to orchestrator command parsing logic in stream/batch modes and associated regression tests; no unrelated modifications detected outside issue #1541 requirements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering all required template sections with clear explanations of the problem, UX journey, architecture changes, validation evidence, security impact, compatibility, and rollback plan.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/core/src/orchestrator/orchestrator.test.ts (1)

945-947: ⚡ Quick win

Assert against any leaked /invoke-workflow text, not just the three exact chunks.

These expectations still pass if a future refactor leaks the command as one recombined message. Mirroring the batch-mode style here would make this regression guard much tighter.

Suggested assertion
-      expect(platform.sendMessage).not.toHaveBeenCalledWith('chat-456', '/invoke-workflow ');
-      expect(platform.sendMessage).not.toHaveBeenCalledWith('chat-456', 'fix-bug ');
-      expect(platform.sendMessage).not.toHaveBeenCalledWith('chat-456', '--project test-project');
+      expect(
+        platform.sendMessage.mock.calls.some(
+          ([id, content]) =>
+            id === 'chat-456' &&
+            typeof content === 'string' &&
+            content.includes('/invoke-workflow')
+        )
+      ).toBe(false);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/src/orchestrator/orchestrator.test.ts` around lines 945 - 947,
Replace the three exact-call assertions with a single check that no call to
platform.sendMessage contains the substring "/invoke-workflow" (to prevent any
leaked command even if split/recombined); locate the mock platform.sendMessage
usage in orchestrator.test.ts and assert by scanning
platform.sendMessage.mock.calls (or map over its arguments) and expect none of
the message strings to include "/invoke-workflow" instead of using
not.toHaveBeenCalledWith on the three specific chunks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/core/src/orchestrator/orchestrator.test.ts`:
- Around line 945-947: Replace the three exact-call assertions with a single
check that no call to platform.sendMessage contains the substring
"/invoke-workflow" (to prevent any leaked command even if split/recombined);
locate the mock platform.sendMessage usage in orchestrator.test.ts and assert by
scanning platform.sendMessage.mock.calls (or map over its arguments) and expect
none of the message strings to include "/invoke-workflow" instead of using
not.toHaveBeenCalledWith on the three specific chunks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c1a137ec-7cfd-497d-86d9-6f3ee2340bda

📥 Commits

Reviewing files that changed from the base of the PR and between 69b2c89 and d51d594.

📒 Files selected for processing (2)
  • packages/core/src/orchestrator/orchestrator-agent.ts
  • packages/core/src/orchestrator/orchestrator.test.ts

Comment thread packages/core/src/orchestrator/orchestrator-agent.ts Outdated
@Wirasm

Wirasm commented May 4, 2026

Copy link
Copy Markdown
Collaborator

@guanghuang related to #1541 — split invocation parsing.

@b1skit

b1skit commented May 5, 2026

Copy link
Copy Markdown
Contributor

@guanghuang @Wirasm @worldomkar

This fix addresses a real bug, but is incomplete. Note:

/register-project has the same vulnerability.
The title and observed behavior focus on /invoke-workflow, but the identical chunk-boundary failure applies to /register-project — the command name, project name, and path can each arrive in separate chunks. Any fix should cover both commands.

There are two distinct failure modes, not one.
The issue describes a single bug (accumulation stops too early), but naively continuing to accumulate indefinitely after detection introduces a second bug: if a post-command chunk arrives without a leading whitespace boundary, it concatenates directly onto the last captured token. For example, if test-project is followed immediately by This should not appear, the regex captures test-projectThis as the project name and the codebase lookup fails silently. Accumulation needs to stop once all required tokens are present — not just when the command prefix fires.

The batch-mode fix has two independent parts.
Fixing accumulation alone is not sufficient for batch mode. filterToolIndicators joins chunks with '\n\n---\n\n' before splitting on '\n\n', which injects bare separator lines between tokens. If the command spans chunks, the name and --project flag end up on separate lines and the parse regex never matches. The fix is to parse assistantMessages.join('') (raw concatenation) for command detection, while keeping filterToolIndicators output as the user-visible message. These are separate concerns and should stay that way.

Verification approach.
A complete regression suite should cover: (1) command split across 3+ chunks in both stream and batch mode, (2) pre-command prose followed by the command, (3) a trailing non-command chunk arriving after the complete command is present, and (4) both /invoke-workflow and /register-project.

I have an alternative PR here that addresses all of these issue completely: #1581

@Wirasm

Wirasm commented May 14, 2026

Copy link
Copy Markdown
Collaborator

Review Summary

Verdict: Minor fixes needed

This PR fixes a real split-command issue in the orchestrator: when providers emit /invoke-workflow before the full command body, the old code prevented subsequent chunks from being accumulated, breaking dispatch. The fix is to track commandResponseSnapshot at detection time and use it for parsing instead of filtered stream output. Two new tests cover the happy path. A couple of edge cases need attention before merge.

Blocking issues

  • packages/core/src/orchestrator/orchestrator-agent.ts:1034: If the AI emits a complete command followed by extra text on the same line, commandResponseSnapshot captures the whole line and parseOrchestratorCommands may fail to match due to trailing text disrupting the regex. Fix: Slice the snapshot at the match end (use INVOKE_WORKFLOW_FULL_RE match index + length), or add a log when snapshot is set but parsing returns null.

Suggested fixes

  • packages/core/src/orchestrator/orchestrator-agent.ts:953–960 and :1111–1118: When the command prefix is detected but parseOrchestratorCommands returns no match, commandResponseSnapshot is never set and the code silently falls back to filtered output. The user sees nothing while the partial command is discarded. Fix: Add a warn log in the else branch:

    } else {
      getLog().warn({ conversationId }, 'orchestrator_command_prefix_detected_but_parse_failed');
    }
  • packages/core/src/orchestrator/orchestrator-agent.ts:960-968 and :1112-1118: No test covers the error path where the command regex matches but parsing fails (e.g., unknown workflow or malformed --project) in the split-chunk scenario. Fix: Add tests for stream and batch mode with /invoke-workflow nonexistent-workflow split across chunks, verifying graceful failure.

Minor / nice-to-have

  • packages/core/src/orchestrator/orchestrator-agent.ts:955-957 and :1108: The multi-line comment explaining accumulation is slightly verbose. Trim to: "// Keep accumulating so providers that split "/invoke-workflow" mid-token still produce a parseable final command." Apply the same trim to the batch mode comment for consistency.

Compliments

  • The two new regression tests ('dispatches workflow when command body arrives after /invoke-workflow detection' in stream and batch mode) are well-structured and cover the exact scenario the PR fixes.
  • The comment at handleBatchMode:1201–1202 is excellent — correctly captures the non-obvious invariant that raw stream is needed for commands to avoid batch-mode separator corruption.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality.

@Wirasm

Wirasm commented May 25, 2026

Copy link
Copy Markdown
Collaborator

Hi @guanghuang — checking in. @b1skit flagged that this fix addresses a real bug but is incomplete (see their comment). The PR is currently DIRTY. Two paths forward:

  1. Expand the scope to cover the additional cases @b1skit identified, then rebase
  2. Land this narrower fix as-is and open a follow-up issue for the broader scope

Either is fine — let me know which way you want to go. If I don't hear back in ~7 days I'll close as inactive.

@guanghuang guanghuang force-pushed the fix/pi-split-invoke-workflow-dispatch branch from d51d594 to 39f6255 Compare May 26, 2026 02:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pi provider split assistant chunks can prevent AI-generated /invoke-workflow from dispatching

3 participants