Skip to content

bug(web): Reject + Resume buttons dead-end silently for non-web-parent runs #1522

@blankse

Description

@blankse

Bug Description

When a workflow run was started from the CLI (or any non-web platform) and is then rejected/resumed via the Web UI, the action returns success but nothing executes. There is no visible indication to the user that they must manually invoke archon workflow resume <id> from a terminal — the run just sits in failed status with metadata.rejection_reason populated.

This is distinct from #1131 (which fixed parent_conversation_id = NULL for web-started runs) and #1429 (auto-resume bypassing the gate). It also isn't the same surface area as the polish/refactor follow-ups in #1350 — that issue treats the cross-adapter guard as intentional, which it is. The bug here is a missing user-facing exit ramp when that intentional guard fires.

Steps to Reproduce

  1. Start a workflow with an approval node + on_reject from the CLI:
    archon workflow run my-workflow-with-approval-gate "..."
  2. Workflow pauses at the approval gate (correct).
  3. Open the Web UI dashboard, find the paused run, click Reject, type a reason, confirm.
  4. Web UI optimistically refreshes; the run is now failed.
  5. Expected: either the on_reject prompt runs, or the UI clearly tells the user how to resume.
  6. Actual: silence. The user has to dig into the DB or guess that they need archon workflow resume <id>.

Root Cause

`packages/server/src/routes/api.ts`, `tryAutoResumeAfterGate` (currently around line 1058):

```ts
if (!run.parent_conversation_id) return false;

if (parentConv.platform_type !== 'web') return false;
```

For CLI-started runs `parent_conversation_id` is NULL by design, so the function exits at the first guard. `rejectWorkflowRunRoute` then returns:

```ts
message: autoResumed
? `Workflow rejected: ${run.workflow_name}. Running on-reject prompt.`
: `Workflow rejected: ${run.workflow_name}. On-reject prompt will run on resume.`
```

The non-resumed message (`"… will run on resume"`) is technically correct but doesn't tell the user how to actually trigger that resume. The Web UI's reject handler (`DashboardPage.tsx:300`) only surfaces `setActionError` on throw — success messages are dropped, so even the current vague hint never reaches the user.

The Resume API endpoint (`api.ts:1899`) has the same shape: it just verifies the status is resumable and returns `"Re-run the workflow to auto-resume from completed nodes."`, which is a non-actionable string for someone on the web dashboard.

Concrete Evidence

Real run hit during local development (CLI start, web reject):

```
id: 0b3bdf0a5765bebccfe5694f3673c8c6
workflow_name: df-implement-with-preview-fast
status: failed
parent_conversation_id: NULL
metadata.rejection_reason: "…"
metadata.rejection_count: 1
```

Events: `approval_requested` → `approval_received {decision: rejected}` → nothing. No `node_started` for the synthetic `approval-gate:on_reject` node, no further `workflow_started`. `archon workflow resume ` from a terminal then unblocks it correctly.

Why This Matters

A reasonable user expectation when clicking Reject on a paused run with `on_reject` configured is that the workflow either iterates or tells you why it can't. Today the path silently dead-ends, and the same is true for the Resume button on any `failed` run. The cross-adapter guard exists for good reason (the API server isn't wired to dispatch back into a CLI/Slack/Telegram parent), but the consequence of that guard isn't communicated.

Proposed Fix (messaging-only, minimal)

  1. `tryAutoResumeAfterGate` returns a structured result instead of a plain boolean:
    ```ts
    { resumed: true } | { resumed: false; reason: 'no_parent' | 'non_web_parent' | 'dispatch_failed' }
    ```
  2. `rejectWorkflowRunRoute` (and `approveWorkflowRunRoute` for symmetry) construct a message that includes the explicit next-step command when `resumed === false`:
    ```
    Workflow rejected: ${name}. Auto-resume not available for this run (started outside the web UI).
    Run `archon workflow resume ${runId}` from the working directory to apply the on_reject prompt.
    ```
  3. `resumeWorkflowRunRoute` returns the same explicit command form.
  4. Tests covering all three guard branches in `tryAutoResumeAfterGate` (no parent, non-web parent, dispatch throws), each asserting the response message includes the run ID and `archon workflow resume` literal.

Optional Follow-up (Web UI surfacing — separate PR if appropriate)

Currently `DashboardPage.tsx` only renders `actionError` on throws. Either:

  • (a) Surface success-path messages via a toast/banner, or
  • (b) Render an inline "Resume manually" indicator on run cards whose `status === 'failed' && metadata.rejection_reason` is set.

Happy to scope this into the same PR if preferred, or land the API-only piece first and follow up.

Environment

  • Archon v0.3.10
  • Platform: WSL2 / Ubuntu, SQLite database
  • Repro reliably with any CLI-started workflow that has `approval` + `on_reject`

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium priority - Backlog, when time permitsarea: webWeb UI (packages/web) - React frontendbugSomething is broken

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions