fix(web): allow deleting nodes from Workflow Builder (#971)#1113

Merged
Wirasm merged 4 commits into coleam00:dev from medevs:fix/971-workflow-builder-delete on Apr 22, 2026

@medevs medevs commented Apr 12, 2026

Contributor

Summary

  • Problem: Nodes added to the Workflow Builder canvas cannot be deleted — Delete/Backspace keys do nothing after drag-and-drop, no right-click context menu exists, and the Delete Node button is either hidden below the viewport (Prompt/Command) or entirely absent (Bash nodes).
  • Why it matters: Bash nodes are impossible to remove through any UI mechanism once placed. Other node types require obscure scrolling to find the delete button.
  • What changed: Auto-select dropped nodes, added right-click context menu, moved Delete button to inspector header, added Backspace key support.
  • What did not change (scope boundary): No new dependencies, no changes to workflow execution, no backend changes. Purely frontend UX fix in @archon/web.

UX Journey

Before

User                        Workflow Builder             ReactFlow
────                        ────────────────             ─────────
drags Prompt node ────────▶ onDrop() fires
                            creates node
                            [X] never calls onNodeSelect()
                            selectedNodeId stays null
presses Delete ───────────▶ useBuilderKeyboard
                            [X] if (!selectedNodeId) return
                            (silent no-op)
right-clicks node ────────▶ [X] no onContextMenu handler
                            (no menu rendered)
opens inspector ──────────▶ NodeInspector renders
navigates to Advanced tab ▶ Delete button present
                            [X] button below viewport
                            (requires scroll to discover)

drags Bash node ──────────▶ onDrop() fires, creates node
opens inspector ──────────▶ NodeInspector renders
                            isBash = true
                            [X] Advanced tab hidden
                            Delete button absent from DOM
user cannot delete ◀─────── zero UI mechanism for deletion

After

User                        Workflow Builder             ReactFlow
────                        ────────────────             ─────────
drags any node ───────────▶ onDrop() fires
                            creates node
                            *calls onNodeSelect(id)*
                            selectedNodeId = new node
presses Delete/Backspace ─▶ useBuilderKeyboard
                            *selectedNodeId is set*
                            *node deleted* ✓
right-clicks node ────────▶ *handleNodeContextMenu*
                            *context menu rendered*
clicks "Delete node" ─────▶ *onNodeDelete(nodeId)*
                            *node deleted* ✓
opens inspector ──────────▶ NodeInspector renders
                            *Delete button in header*
                            *visible for ALL types* ✓
clicks header Delete ─────▶ *node deleted* ✓

Architecture Diagram

Before

WorkflowBuilder
├── WorkflowCanvas (onDrop creates node, no select)
│   └── ReactFlow (no onNodeContextMenu)
├── NodeInspector
│   └── AdvancedTab (Delete button here, hidden for bash)
└── useBuilderKeyboard (Delete key only, needs selectedNodeId)

After

WorkflowBuilder
├── WorkflowCanvas (onDrop creates node + [~] auto-selects)
│   ├── ReactFlow ([+] onNodeContextMenu)
│   └── [+] Context menu div (Delete node button)
├── NodeInspector
│   ├── [~] Header (Delete button moved here, all node types)
│   └── AdvancedTab ([-] Delete button removed)
└── useBuilderKeyboard ([~] Delete + Backspace keys)

Connection inventory:

| From | To | Status | Notes |
|---|---|---|---|
| WorkflowBuilder | WorkflowCanvas | modified | New onNodeDelete prop |
| WorkflowCanvas | ReactFlow | modified | Added onNodeContextMenu handler |
| WorkflowCanvas | Context menu | new | Inline positioned div with delete action |
| WorkflowBuilder | NodeInspector | unchanged | onDelete prop unchanged |
| NodeInspector | AdvancedTab | modified | Removed onDelete prop from AdvancedTab |
| useBuilderKeyboard | WorkflowBuilder | modified | Now handles Backspace alongside Delete |

Label Snapshot

  • Risk: risk: low
  • Size: size: S
  • Scope: web
  • Module: web:WorkflowCanvas, web:NodeInspector, web:WorkflowBuilder, web:useBuilderKeyboard

Change Metadata

  • Change type: bug
  • Primary scope: web

Linked Issue

Validation Evidence (required)

bun run type-check  ✅ clean across all 9 packages
bun run lint        ✅ clean (--max-warnings 0)
bun run format:check ✅ all source files pass (only HANDOFF.md flagged — not part of PR)
bun run test        ✅ all tests pass across all 9 packages
  • Evidence provided: Full bun run validate output and manual browser testing
  • If any command is intentionally skipped, explain why: N/A — all passed

Security Impact (required)

  • New permissions/capabilities? No
  • New external network calls? No
  • Secrets/tokens handling changed? No
  • File system access scope changed? No

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Database migration needed? No

Human Verification (required)

What was personally validated beyond CI:

  • Verified scenarios:
    • Drag Prompt node → press Delete → node removed
    • Drag Prompt node → press Backspace → node removed
    • Drag Command node → right-click → "Delete node" → node removed
    • Drag Bash node → press Delete → node removed
    • Drag Bash node → open inspector → header Delete button visible → click → node removed
    • Right-click → Escape → menu dismissed, node remains
    • Right-click → click canvas → menu dismissed, node remains
    • Type in NodeInspector textarea → Backspace edits text (does NOT delete node)
    • Delete connected node → edges also removed
  • Edge cases checked:
    • Backspace in text fields does not trigger node deletion (isInputTarget guard)
    • Context menu positioned correctly and doesn't overflow viewport
    • Dark mode styling uses existing Tailwind tokens
  • What was not verified: Automated component tests (no Vitest/RTL setup exists in @archon/web)

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: Workflow Builder UI only
  • Potential unintended effects: None — changes are isolated to 4 frontend files, no backend or workflow engine impact
  • Guardrails/monitoring for early detection: Type-check + lint + existing unit tests all pass

Rollback Plan (required)

  • Fast rollback command/path: git revert <commit-sha> — single atomic commit
  • Feature flags or config toggles (if any): None
  • Observable failure symptoms: Node deletion not working in the Workflow Builder (same as current state before fix)

Risks and Mitigations

  • Risk: e.preventDefault() on Backspace could cause unexpected behavior if focus escapes a text input
    • Mitigation: isInputTarget() guard in useBuilderKeyboard checks for input, textarea, and [contenteditable] before the keydown handler fires
  • Risk: Inline context menu styling may look off in certain themes
    • Mitigation: Uses existing project Tailwind tokens (bg-surface-elevated, border-border, text-error) consistent with the rest of the UI
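
The Backspace guard described in the first risk can be sketched as a pure function. This is a minimal illustration, not the code in useBuilderKeyboard: the names `TargetLike`, `KeyEventLike`, `isEditableTarget`, and `handleDeleteKeys` are hypothetical, and the event shapes are simplified stand-ins for DOM types.

```typescript
// Hypothetical, simplified stand-ins for the DOM event target and keyboard event.
interface TargetLike {
  tagName: string;
  isContentEditable?: boolean;
}

interface KeyEventLike {
  key: string;
  target: TargetLike | null;
  preventDefault: () => void;
}

interface DeleteActions {
  onDeleteSelected: () => void;
}

// Tags the PR description names as text-entry targets.
const EDITABLE_TAGS = new Set(['INPUT', 'TEXTAREA']);

// True when the key press originated in something the user can type into.
function isEditableTarget(target: TargetLike | null): boolean {
  if (target === null) return false;
  return EDITABLE_TAGS.has(target.tagName) || target.isContentEditable === true;
}

// Returns true when the key press was consumed as a node deletion.
function handleDeleteKeys(e: KeyEventLike, actions: DeleteActions): boolean {
  // The guard runs first, so Backspace inside a text field keeps editing text.
  if (isEditableTarget(e.target)) return false;
  if (e.key !== 'Delete' && e.key !== 'Backspace') return false;
  e.preventDefault();
  actions.onDeleteSelected();
  return true;
}
```

Checking the target before the key keeps every shortcut, not only Backspace, behind the same guard.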

Summary by CodeRabbit

  • New Features

    • Right-click context menu on the workflow canvas to delete nodes.
    • Prominent Delete button moved to the inspector header (visible for all node types).
    • Newly created nodes are auto-selected for immediate editing.
  • Bug Fixes / Improvements

    • Backspace now works alongside Delete to remove selected nodes.
    • Removed duplicate Delete control from the Advanced tab.
  • Documentation

    • Workflow Builder docs updated to describe node deletion options.

Three independent gaps prevented users from deleting nodes added to the
Workflow Builder canvas: dropped nodes were never auto-selected so
keyboard shortcuts silently no-oped, no right-click context menu
existed, and the Delete Node button was buried in the Advanced tab
(hidden below the viewport for Prompt/Command, completely absent for
Bash since bash nodes have no Advanced tab).

Fixes coleam00#971.
@coderabbitai

coderabbitai Bot commented Apr 12, 2026

Walkthrough

Moved deletion UI out of the Advanced tab into a persistent inspector header, added a canvas right-click context menu with Delete action, auto-select newly created nodes on drop/quick-add, and made Backspace and Delete both trigger node deletion.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Inspector UI: packages/web/src/components/workflows/NodeInspector.tsx | Removed Advanced-tab inline "Delete Node"; added persistent destructive "Delete" button in DagInspector header (always shown, aria-label="Delete node"). |
| Builder deletion logic: packages/web/src/components/workflows/WorkflowBuilder.tsx | Introduced nodesRef/edgesRef and pushSnapshotLatest(); added handleNodeDeleteById(nodeId: string) to remove node + incident edges, clear selection, mark dirty; refactored existing delete calls to delegate to it. |
| Canvas interactions & context menu: packages/web/src/components/workflows/WorkflowCanvas.tsx | WorkflowCanvasProps now requires onNodeDelete(nodeId); added right-click context menu with "Delete node" action, onNodeContextMenu registration, dismissal on Escape/outside clicks, and auto-selects newly created nodes (drop & quick-add) while calling onPushSnapshot. |
| Keyboard shortcuts & exports: packages/web/src/hooks/useBuilderKeyboard.ts, packages/web/src/hooks/useBuilderKeyboard.test.ts | Exported BuilderKeyboardActions, isInputTarget, and new handleBuilderKeydown; isInputTarget recognizes ARIA roles (combobox, textbox, searchbox); both Delete and Backspace now invoke onDeleteSelected() and call preventDefault(); added tests covering new behaviors. |
| Docs & scripts: packages/docs-web/src/content/docs/adapters/web.md, packages/web/package.json | Documented node deletion UX and added src/hooks/ to the test script. |
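
The "remove node + incident edges" step attributed to handleNodeDeleteById can be illustrated with a pure function. This is a sketch under assumptions: the `GraphNode`, `GraphEdge`, and `Graph` shapes and the name `removeNodeById` are hypothetical, and the real handler also clears the selection and marks the workflow dirty, which is omitted here.

```typescript
// Hypothetical minimal graph shapes; only the fields needed for deletion are modeled.
interface GraphNode {
  id: string;
}

interface GraphEdge {
  id: string;
  source: string;
  target: string;
}

interface Graph {
  nodes: GraphNode[];
  edges: GraphEdge[];
}

// Returns a new graph without the given node and without any edge that
// starts or ends at it. The input graph is left untouched.
function removeNodeById(graph: Graph, nodeId: string): Graph {
  return {
    nodes: graph.nodes.filter((n) => n.id !== nodeId),
    edges: graph.edges.filter((e) => e.source !== nodeId && e.target !== nodeId),
  };
}
```

Returning a new object rather than mutating in place suits state setters that expect fresh arrays.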

Sequence Diagram

sequenceDiagram
    participant User
    participant Canvas as WorkflowCanvas
    participant Builder as WorkflowBuilder
    participant Inspector as NodeInspector
    participant Keyboard as useBuilderKeyboard

    rect rgba(200, 150, 255, 0.5)
    Note over User,Canvas: Drag-and-Drop Node
    User->>Canvas: Drag node onto canvas
    Canvas->>Canvas: onDrop() creates node
    Canvas->>Canvas: onNodeSelect(id) — auto-select new node
    Canvas->>Inspector: open/refresh inspector for selected node
    end

    rect rgba(100, 200, 255, 0.5)
    Note over User,Keyboard: Delete via Keyboard
    User->>Keyboard: Press Delete or Backspace
    Keyboard->>Keyboard: preventDefault()
    Keyboard->>Builder: actions.onDeleteSelected()
    Builder->>Builder: handleNodeDeleteById(selectedId)
    Builder->>Canvas: remove node & edges, clear selection
    end

    rect rgba(150, 200, 150, 0.5)
    Note over User,Canvas: Delete via Context Menu
    User->>Canvas: Right-click node
    Canvas->>Canvas: onNodeContextMenu() opens menu, auto-select node
    User->>Canvas: Click "Delete node"
    Canvas->>Builder: onNodeDelete(nodeId)
    Builder->>Builder: handleNodeDeleteById(nodeId)
    Builder->>Canvas: remove node & edges, clear selection
    end

    rect rgba(200, 200, 100, 0.5)
    Note over User,Inspector: Delete via Header Button
    User->>Inspector: Click "Delete" in DagInspector header
    Inspector->>Builder: onDelete()
    Builder->>Builder: handleNodeDelete() → handleNodeDeleteById()
    Builder->>Canvas: remove node & edges, clear selection
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 I hopped through nodes both new and old,
A right-click menu, a button bold.
Backspace joins Delete in the dance,
New nodes auto-selected—give them a chance!
Hooray for tidy DAGs, carrots in advance 🥕

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Linked Issues check | ❓ Inconclusive | The PR addresses all coding requirements from issue #971 (auto-select, context menu, Delete button visibility, Backspace support) but the pr_objectives note that critical review feedback (C1: stale undo snapshot, C2: context menu viewport overflow, I1: ARIA role violations, I3: isInputTarget improvements) was partially addressed in a follow-up commit. The current state of fixes for these items is unclear from the provided context. | Verify that critical issues C1 and C2 regarding stale undo snapshots and viewport-clamped menu positioning have been fully resolved in the current commit state, and confirm that ARIA and isInputTarget improvements align with the implemented fixes. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The PR title 'fix(web): allow deleting nodes from Workflow Builder (#971)' clearly and concisely describes the main change (enabling node deletion in the Workflow Builder) and directly addresses the linked issue. |
| Description check | ✅ Passed | The PR description comprehensively covers all required template sections: problem statement, justification, detailed changes with UX journeys and architecture diagrams, validation evidence (type-check, lint, format, test), security assessment, compatibility notes, human verification with specific scenarios, side effects analysis, and rollback plan. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the node deletion feature: WorkflowBuilder/Canvas/Inspector keyboard/menu/button changes, useBuilderKeyboard enhancements, documentation updates, and test additions. No unrelated refactoring, dependency changes, backend modifications, or workflow execution logic changes detected. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/web/src/components/workflows/WorkflowCanvas.tsx`:
- Around line 175-181: The canvas-created node flows call setNodes(...) and then
onNodeSelect/onDirty but never call onPushSnapshot, so undo/redo won't capture
this addition; update both places that append a node (the handler using setNodes
and the other occurrence around lines 292-299) to invoke onPushSnapshot?.()
immediately before mutating state (i.e., before calling setNodes) so the new
node addition is pushed as a snapshot; keep the existing onNodeSelect(id) and
onDirty() calls after the setNodes call.
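
The ordering this comment asks for (snapshot first, then mutate) can be sketched without React. The sketch assumes a snapshot records the state that undo should return to; `UndoStack`, `CanvasState`, and `addNodeWithHistory` are hypothetical names, not the builder's actual history API.

```typescript
// Hypothetical canvas state: just a list of node ids.
interface CanvasState {
  nodes: string[];
}

// Minimal undo stack: each entry is the state to restore on undo.
class UndoStack {
  private past: CanvasState[] = [];

  push(state: CanvasState): void {
    // Copy the array so later mutations cannot alter the recorded snapshot.
    this.past.push({ nodes: [...state.nodes] });
  }

  undo(): CanvasState | undefined {
    return this.past.pop();
  }
}

// Push the pre-add state BEFORE producing the new state, so that a later
// undo restores the canvas as it was without the new node.
function addNodeWithHistory(state: CanvasState, history: UndoStack, id: string): CanvasState {
  history.push(state);
  return { nodes: [...state.nodes, id] };
}
```

If the push were skipped, the stack would hold no entry for the addition and undo would have nothing to restore.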
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b781b628-ca2b-4875-80e9-cc3581c0876f

📥 Commits

Reviewing files that changed from the base of the PR and between 536584d and 8a40e69.

📒 Files selected for processing (4)
  • packages/web/src/components/workflows/NodeInspector.tsx
  • packages/web/src/components/workflows/WorkflowBuilder.tsx
  • packages/web/src/components/workflows/WorkflowCanvas.tsx
  • packages/web/src/hooks/useBuilderKeyboard.ts

Comment thread packages/web/src/components/workflows/WorkflowCanvas.tsx Outdated
Call onPushSnapshot() before setNodes() in both onDrop and quick-add
handlers so that node additions are captured by undo/redo history.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Wirasm Wirasm closed this Apr 20, 2026
@Wirasm Wirasm reopened this Apr 20, 2026
@Wirasm

Wirasm commented Apr 21, 2026

Collaborator

PR Review Summary — #1113 (fix/971-workflow-builder-delete)

Multi-agent review: code-reviewer, docs-impact, comment-analyzer, pr-test-analyzer, code-simplifier.

Verdict: NEEDS FIXES (minor — no blockers to the fix itself)

The UX fix is correct and all three new delete paths (auto-select, context menu, header button) work. The issues below are about undo-stack correctness under tight timing, minor a11y, viewport clamping, and CLAUDE.md comment-style compliance.


Critical Issues

| ID | Agent | Issue | Location |
|---|---|---|---|
| C1 | code-reviewer | handleNodeDeleteById captures nodes/edges in its dep array, so pushSnapshot({nodes, edges}) can snapshot pre-drop state if the user presses Delete/Backspace in the same tick as auto-select after a drop. Undo stack can record a snapshot that does not include the just-dropped node. Fix: hold nodes/edges in refs and drop them from the dep array. | WorkflowBuilder.tsx:236-245 |
| C2 | code-reviewer | Context menu uses position: fixed with raw e.clientX/e.clientY; no clamping. A right-click near the viewport edge renders the menu partially off-screen. Fix: clamp with Math.min(x, innerWidth - MENU_W) / Math.min(y, innerHeight - MENU_H). | WorkflowCanvas.tsx:389-409 |
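
The stale-closure hazard in C1 and the ref-based fix can be illustrated outside React. This is a simplified model, not the builder's code: `RefLike` is a hypothetical stand-in for the object returned by React's useRef, and the two factory functions are invented for the comparison.

```typescript
// Hypothetical stand-in for a React ref object.
interface RefLike<T> {
  current: T;
}

type NodeSnapshot = { nodes: string[] };

// Captures the `nodes` array that existed when the callback was created,
// like a memoized callback that has not yet been re-created after a state update.
function makeClosureSnapshotter(nodes: string[]): () => NodeSnapshot {
  return () => ({ nodes: [...nodes] });
}

// Reads through the ref at call time, so it sees whatever was assigned last.
function makeRefSnapshotter(nodesRef: RefLike<string[]>): () => NodeSnapshot {
  return () => ({ nodes: [...nodesRef.current] });
}

// Usage: both snapshotters are created before a node is dropped.
const nodesRef: RefLike<string[]> = { current: ['a'] };
const staleSnapshot = makeClosureSnapshotter(nodesRef.current);
const freshSnapshot = makeRefSnapshotter(nodesRef);

// A state update replaces the array rather than mutating it.
nodesRef.current = ['a', 'dropped'];
```

After the update, `staleSnapshot()` still reports only the original node while `freshSnapshot()` includes the dropped one, which is the difference the suggested fix relies on.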

Important Issues

| ID | Agent | Issue | Location |
|---|---|---|---|
| I1 | code-reviewer | role="menu" + role="menuitem" without autoFocus, tabIndex, or arrow-key handling violates the ARIA menu contract. For a single-item menu the cleanest fix is to drop the roles entirely and rely on the already-accessible `<button>`. | WorkflowCanvas.tsx:390-408 |
| I2 | code-reviewer | onPushSnapshot={() => pushSnapshot({ nodes, edges })} is an inline closure (not memoized). Causes onDrop to reconstruct every render and introduces a narrow stale-closure window. Fix: useCallback or the ref pattern from C1. | WorkflowBuilder.tsx:491-493 |
| I3 | code-reviewer | isInputTarget() covers INPUT/TEXTAREA/SELECT/contentEditable but not role="combobox" / role="textbox" / role="searchbox". A future shadcn/Radix widget with focus could cause Backspace to delete a node while the user is editing. Extend the guard to check ARIA roles. | useBuilderKeyboard.ts:104-109 |
| I4 | comment-analyzer | CLAUDE.md explicitly says not to reference issue numbers in comments ("they rot"). Seven 971-references were added across the four files. | NodeInspector.tsx:701, WorkflowBuilder.tsx:235, WorkflowCanvas.tsx:178/295/308/388, useBuilderKeyboard.ts:107 |
| I5 | comment-analyzer | The handleNodeDeleteById block comment describes WHAT + lists callers; CLAUDE.md says to avoid WHAT comments. Callers are IDE-findable. | WorkflowBuilder.tsx:233 |
| I6 | pr-test-analyzer | useBuilderKeyboard.ts is a pure event handler already testable with bun test; no RTL needed. The Backspace guard has high regression potential (silent data loss if a future edit breaks the isInputTarget check). Rating: 8/10. One focused test is ~30 min of work. | useBuilderKeyboard.ts |

Suggestions

| ID | Agent | Suggestion | Location |
|---|---|---|---|
| S1 | code-simplifier | Collapse handleNodeDeleteById + handleNodeDelete into one callback with an id param. Callers all have access to selectedNodeId. Removes one useCallback, one dep array, and the block comment explaining the pair. | WorkflowBuilder.tsx:232-246 |
| S2 | code-simplifier | closeContextMenu = useCallback(() => setContextMenu(null), []) wraps a stable setter; it adds no value and bloats the useEffect dep array. Call setContextMenu(null) directly. | WorkflowCanvas.tsx (new code) |
| S3 | code-simplifier | Rename onPointer → onClickOutside. "Pointer" in web API terms means PointerEvent; the handler takes MouseEvent. | WorkflowCanvas.tsx |
| S4 | docs-impact | packages/docs-web/src/content/docs/adapters/web.md:170-177: the Workflow Builder section doesn't mention the new delete affordances. The "keyboard shortcuts" bullet (line 174) currently implies only undo/redo. Add a "Delete node" bullet covering Delete/Backspace, the header button, and the right-click menu. | docs-web |

Strengths

  • { capture: true } on the dismissal listeners correctly beats ReactFlow's own bubble-phase handlers and any stopPropagation() calls (verified by comment-analyzer).
  • isInputTarget() guard for the existing Delete case already covered the common cases; extending to Backspace is the right call.
  • The onPushSnapshot?.() calls added in onDrop and handleQuickAddNode are a real correctness win: before this PR, node additions were not captured in the undo history.
  • Moving Delete to the inspector header (visible for bash which has no Advanced tab) is the correct UX resolution for the original bug.
  • No type leaks, no any, no new dependencies, no backend changes — scope matches the PR description exactly.

Recommended Next Steps

  1. Fix C1 (stale-closure snapshot) and C2 (viewport clamping) before merge.
  2. Remove the 7 issue-number references in comments per CLAUDE.md (I4).
  3. Consider dropping role="menu"/role="menuitem" (I1) — the simplest conformant path.
  4. Add a short useBuilderKeyboard test for the Delete/Backspace + isInputTarget invariant (I6).
  5. One-line docs addition in web.md (S4).
  6. I2, I3, I5, and the simplifications (S1–S3) are nice-to-have and can land in a follow-up.

medevs and others added 2 commits April 21, 2026 18:46
- Hold nodes/edges in refs so handleNodeDeleteById and onPushSnapshot
  can't capture stale pre-drop state (fixes undo-stack correctness).
- Clamp context-menu x/y to viewport so right-click near edges stays
  fully on-screen.
- Drop non-conformant role=menu/menuitem from the single-item context
  menu; rely on the native button for accessibility.
- Extend isInputTarget() to cover ARIA combobox/textbox/searchbox so
  Backspace in Radix/shadcn widgets never nukes a node.
- Extract handleBuilderKeydown as a pure function and add tests
  covering the Delete/Backspace + isInputTarget invariant.
- Remove issue-number references from code comments per CLAUDE.md.
- Document the new delete affordances in the Workflow Builder docs.
- Inline context-menu dismissal, rename pointer handler, drop unused
  deps in keyboardActions useMemo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (5)
packages/web/src/components/workflows/WorkflowBuilder.tsx (2)

400-404: Redundant double-guard on selectedNodeId.

onDeleteSelected guards on selectedNodeId and then calls handleNodeDelete, which performs the same guard. You can simplify by calling handleNodeDelete directly (it already no-ops when nothing is selected) — this was also flagged as suggestion S1 in the review summary.

Proposed simplification
-      onDeleteSelected: (): void => {
-        if (selectedNodeId) {
-          handleNodeDelete();
-        }
-      },
+      onDeleteSelected: handleNodeDelete,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/components/workflows/WorkflowBuilder.tsx` around lines 400 -
404, The onDeleteSelected handler redundantly checks selectedNodeId before
calling handleNodeDelete which already no-ops when nothing is selected; remove
the outer guard so onDeleteSelected simply calls handleNodeDelete directly
(update the onDeleteSelected function to call handleNodeDelete unconditionally
and remove references to selectedNodeId inside that handler).

175-186: Ref-sync pattern: consider syncing during render for concurrent-safety.

Syncing refs in a useEffect is generally fine here because pushSnapshotLatest is invoked from user-driven event handlers that run after commit. However, under React 19 concurrent rendering, a render can be started and discarded before the effect runs, which means any callback that reads nodesRef.current between a committed state update and the effect could momentarily see a stale value.

For strictly latest-state reads in callbacks, a common pattern is to assign during render:

Alternative ref-sync pattern
-  const nodesRef = useRef(nodes);
-  const edgesRef = useRef(edges);
-  useEffect(() => {
-    nodesRef.current = nodes;
-    edgesRef.current = edges;
-  }, [nodes, edges]);
+  const nodesRef = useRef(nodes);
+  const edgesRef = useRef(edges);
+  // Sync during render so event handlers fired between render and effect
+  // still observe the latest committed state.
+  nodesRef.current = nodes;
+  edgesRef.current = edges;

Functionally equivalent for this PR's use case — raising as an optional refinement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/components/workflows/WorkflowBuilder.tsx` around lines 175 -
186, The current pattern updates nodesRef/edgesRef inside a useEffect which can
be stale under concurrent rendering; instead assign nodesRef.current = nodes and
edgesRef.current = edges during render (before defining pushSnapshotLatest) so
pushSnapshotLatest (which uses nodesRef/edgesRef) always reads the freshest
values; locate the refs named nodesRef and edgesRef and move the sync into
render (or at least perform the assignment immediately before the useCallback
for pushSnapshotLatest), and remove or keep the effect only if you still need it
for legacy timing.
packages/web/src/hooks/useBuilderKeyboard.test.ts (1)

86-136: LGTM — focused tests addressing review feedback I6.

The delete invariant coverage is comprehensive: Delete/Backspace on canvas fires onDeleteSelected; INPUT/TEXTAREA/contentEditable/ARIA combobox/textbox all suppress; enabled=false suppresses shortcuts. Tests are deterministic and don't rely on mock.module.

Optional: consider also asserting e.preventDefault was called on the Delete/Backspace canvas cases to pin down that invariant, since downstream ReactFlow behavior depends on it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/hooks/useBuilderKeyboard.test.ts` around lines 86 - 136, Add
assertions that the synthetic event's preventDefault was called in the canvas
Delete/Backspace tests to ensure the handler suppresses native behavior: in the
tests that call handleBuilderKeydown(makeEvent('Delete'...), actions) and
handleBuilderKeydown(makeEvent('Backspace'...), actions) where tagName is 'DIV',
assert that the event object's preventDefault was invoked (alongside the
existing expect on actions.calls.onDeleteSelected) by checking the mock event
created by makeEvent; reference the handleBuilderKeydown and makeEvent helpers
and the actions.calls.onDeleteSelected expectation when adding these
preventDefault assertions.
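
One way to pin the preventDefault invariant is a mock event that records whether it was prevented. The sketch below is self-contained and does not use the repository's helpers: `makeFakeEvent`, `standInKeydown`, and `runDeleteKeyChecks` are hypothetical, and `standInKeydown` only approximates the behavior described for handleBuilderKeydown.

```typescript
// Hypothetical mock event that records preventDefault calls.
interface FakeEvent {
  key: string;
  target: { tagName: string };
  defaultPrevented: boolean;
  preventDefault: () => void;
}

function makeFakeEvent(key: string, tagName: string): FakeEvent {
  const event: FakeEvent = {
    key,
    target: { tagName },
    defaultPrevented: false,
    preventDefault: () => {
      event.defaultPrevented = true;
    },
  };
  return event;
}

// Stand-in for the handler under test (an assumption, not the real hook).
function standInKeydown(e: FakeEvent, onDeleteSelected: () => void): void {
  if (e.target.tagName === 'INPUT' || e.target.tagName === 'TEXTAREA') return;
  if (e.key === 'Delete' || e.key === 'Backspace') {
    e.preventDefault();
    onDeleteSelected();
  }
}

// Runs the checks and returns a list of failure messages (empty means pass).
function runDeleteKeyChecks(): string[] {
  const failures: string[] = [];
  for (const key of ['Delete', 'Backspace']) {
    let calls = 0;

    const onCanvas = makeFakeEvent(key, 'DIV');
    standInKeydown(onCanvas, () => { calls += 1; });
    if (calls !== 1) failures.push(`${key} on canvas: expected 1 delete call, got ${calls}`);
    if (!onCanvas.defaultPrevented) failures.push(`${key} on canvas: preventDefault not called`);

    const inTextarea = makeFakeEvent(key, 'TEXTAREA');
    standInKeydown(inTextarea, () => { calls += 1; });
    if (calls !== 1) failures.push(`${key} in textarea: delete must not fire`);
    if (inTextarea.defaultPrevented) failures.push(`${key} in textarea: default must not be prevented`);
  }
  return failures;
}
```

In a bun test file the same checks would normally be written as individual expect calls; collecting failures in an array just keeps the sketch free of any test-runner import.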
packages/web/src/components/workflows/WorkflowCanvas.tsx (2)

304-317: Viewport clamping addresses C2; consider also clamping the lower bound.

The right/bottom clamping with approximate menu dimensions correctly prevents overflow in the common case. One optional hardening: if the viewport is narrower than CONTEXT_MENU_WIDTH (rare, but possible on very small windows/iframes), innerWidth - CONTEXT_MENU_WIDTH is negative and the menu is pushed off-screen to the left. Clamping the lower bound to 0 avoids that:

Proposed refinement
-      const x = Math.min(e.clientX, window.innerWidth - CONTEXT_MENU_WIDTH);
-      const y = Math.min(e.clientY, window.innerHeight - CONTEXT_MENU_HEIGHT);
+      const x = Math.max(0, Math.min(e.clientX, window.innerWidth - CONTEXT_MENU_WIDTH));
+      const y = Math.max(0, Math.min(e.clientY, window.innerHeight - CONTEXT_MENU_HEIGHT));

Also, CONTEXT_MENU_WIDTH/CONTEXT_MENU_HEIGHT are static values — hoisting them to module scope (alongside the existing resolveNodeLabel) avoids re-binding on each render.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/components/workflows/WorkflowCanvas.tsx` around lines 304 -
317, The context menu position logic in handleNodeContextMenu can produce
negative x/y when the viewport is smaller than CONTEXT_MENU_WIDTH/HEIGHT; change
the clamping to Math.max(0, Math.min(...)) so x = Math.max(0,
Math.min(e.clientX, window.innerWidth - CONTEXT_MENU_WIDTH)) and similarly for y
before calling setContextMenu({ x, y, nodeId: node.id }). Also hoist
CONTEXT_MENU_WIDTH and CONTEXT_MENU_HEIGHT to module scope (next to
resolveNodeLabel) so they are not re-bound on each render.
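
The two-sided clamp proposed above can be checked in isolation as a pure function. The menu dimensions here are placeholder values chosen for illustration (the PR's actual constants are not shown), and `MenuPoint`, `ViewportSize`, and `clampMenuPosition` are hypothetical names.

```typescript
// Assumed menu size in pixels, for illustration only.
const MENU_WIDTH = 160;
const MENU_HEIGHT = 40;

interface MenuPoint {
  x: number;
  y: number;
}

interface ViewportSize {
  width: number;
  height: number;
}

// Keeps the menu's top-left corner inside the viewport:
// Math.min pulls it back from the right/bottom edge, and Math.max stops it
// from going negative when the viewport is smaller than the menu itself.
function clampMenuPosition(click: MenuPoint, viewport: ViewportSize): MenuPoint {
  return {
    x: Math.max(0, Math.min(click.x, viewport.width - MENU_WIDTH)),
    y: Math.max(0, Math.min(click.y, viewport.height - MENU_HEIGHT)),
  };
}
```

Passing the viewport as an argument instead of reading window.innerWidth inside the function makes the edge cases straightforward to test.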

385-402: Dropping role="menu"/role="menuitem" resolves I1.

For a single-action popover, relying on native <button> semantics is more correct than asserting menu roles without full arrow-key navigation and focus management. If/when this grows to multiple items, either implement the full ARIA menu pattern (roving tabindex, arrow keys, aria-activedescendant) or reach for a primitive like @radix-ui/react-dropdown-menu.

One small accessibility gap remains: the menu doesn't auto-focus when opened, so keyboard-only users who right-click via Shift+F10 can't tab directly to Delete without leaving the canvas. Non-blocking for this PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/components/workflows/WorkflowCanvas.tsx` around lines 385 -
402, The context menu currently relies on a native button (good) but it doesn't
auto-focus when opened; add a ref (e.g., deleteButtonRef) to the Delete button
and, inside a useEffect that watches contextMenu, call
deleteButtonRef.current?.focus() when contextMenu is non-null so keyboard users
can immediately tab/activate the action; keep the existing contextMenuRef and
setContextMenu/onNodeDelete logic unchanged and do not introduce ARIA
menu/menuitem roles unless you implement full menu keyboard handling later.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@packages/web/src/components/workflows/WorkflowBuilder.tsx`:
- Around line 400-404: The onDeleteSelected handler redundantly checks
selectedNodeId before calling handleNodeDelete which already no-ops when nothing
is selected; remove the outer guard so onDeleteSelected simply calls
handleNodeDelete directly (update the onDeleteSelected function to call
handleNodeDelete unconditionally and remove references to selectedNodeId inside
that handler).
- Around line 175-186: The current pattern updates nodesRef/edgesRef inside a
useEffect which can be stale under concurrent rendering; instead assign
nodesRef.current = nodes and edgesRef.current = edges during render (before
defining pushSnapshotLatest) so pushSnapshotLatest (which uses
nodesRef/edgesRef) always reads the freshest values; locate the refs named
nodesRef and edgesRef and move the sync into render (or at least perform the
assignment immediately before the useCallback for pushSnapshotLatest), and
remove or keep the effect only if you still need it for legacy timing.

In `@packages/web/src/components/workflows/WorkflowCanvas.tsx`:
- Around line 304-317: The context menu position logic in handleNodeContextMenu
can produce negative x/y when the viewport is smaller than
CONTEXT_MENU_WIDTH/HEIGHT; change the clamping to Math.max(0, Math.min(...)) so
x = Math.max(0, Math.min(e.clientX, window.innerWidth - CONTEXT_MENU_WIDTH)) and
similarly for y before calling setContextMenu({ x, y, nodeId: node.id }). Also
hoist CONTEXT_MENU_WIDTH and CONTEXT_MENU_HEIGHT to module scope (next to
resolveNodeLabel) so they are not re-bound on each render.
- Around line 385-402: The context menu currently relies on a native button
(good) but it doesn't auto-focus when opened; add a ref (e.g., deleteButtonRef)
to the Delete button and, inside a useEffect that watches contextMenu, call
deleteButtonRef.current?.focus() when contextMenu is non-null so keyboard users
can immediately tab/activate the action; keep the existing contextMenuRef and
setContextMenu/onNodeDelete logic unchanged and do not introduce ARIA
menu/menuitem roles unless you implement full menu keyboard handling later.
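
The clamping suggested for handleNodeContextMenu can be sketched as a pure helper. This is a minimal illustration, not the code in WorkflowCanvas.tsx: the function name `clampMenuPosition`, the `Point` type, and the numeric menu dimensions are assumptions, and the viewport size is passed in rather than read from `window` so the helper can run outside a browser.

```typescript
// Hypothetical dimensions; the real constants live in WorkflowCanvas.tsx.
const CONTEXT_MENU_WIDTH = 160;
const CONTEXT_MENU_HEIGHT = 40;

interface Point {
  x: number;
  y: number;
}

// Clamp a click position so a menu of the assumed size stays inside the viewport.
// The inner Math.min keeps the menu from overflowing the right/bottom edge;
// the outer Math.max(0, ...) covers viewports smaller than the menu itself.
function clampMenuPosition(
  clientX: number,
  clientY: number,
  viewportWidth: number,
  viewportHeight: number,
): Point {
  const x = Math.max(0, Math.min(clientX, viewportWidth - CONTEXT_MENU_WIDTH));
  const y = Math.max(0, Math.min(clientY, viewportHeight - CONTEXT_MENU_HEIGHT));
  return { x, y };
}
```

Because the constants sit at module scope, they are bound once rather than on every render, which is the second half of the suggestion.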

In `@packages/web/src/hooks/useBuilderKeyboard.test.ts`:
- Around line 86-136: Add assertions that the synthetic event's preventDefault
was called in the canvas Delete/Backspace tests to ensure the handler suppresses
native behavior: in the tests that call
handleBuilderKeydown(makeEvent('Delete', ...), actions) and
handleBuilderKeydown(makeEvent('Backspace', ...), actions) where tagName is 'DIV',
assert that the mock event's preventDefault was invoked (alongside the existing
expect on actions.calls.onDeleteSelected); reference the handleBuilderKeydown and
makeEvent helpers when adding these preventDefault assertions.
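
One way to make preventDefault observable in such a test is a mock event that counts its own calls. The sketch below is an assumption-heavy stand-in: the `KeyLikeEvent` and `MockEvent` shapes, the simplified `isInputTarget`, and the body of `handleBuilderKeydown` are hypothetical, and only the helper names come from the review text.

```typescript
// Minimal event shape; the real handler presumably receives a DOM KeyboardEvent.
interface KeyLikeEvent {
  key: string;
  target: { tagName: string };
  preventDefault: () => void;
}

interface MockEvent extends KeyLikeEvent {
  preventDefaultCalls: number;
}

// Hypothetical test helper: builds an event that records preventDefault calls.
function makeEvent(key: string, tagName: string): MockEvent {
  const event: MockEvent = {
    key,
    target: { tagName },
    preventDefaultCalls: 0,
    preventDefault() {
      event.preventDefaultCalls += 1;
    },
  };
  return event;
}

// Simplified stand-in; the PR's version also covers ARIA combobox/textbox/searchbox.
function isInputTarget(target: { tagName: string }): boolean {
  return ['INPUT', 'TEXTAREA', 'SELECT'].includes(target.tagName);
}

// Assumed behavior: ignore keys typed into inputs, otherwise suppress the
// native action and delete the selected node.
function handleBuilderKeydown(
  event: KeyLikeEvent,
  actions: { onDeleteSelected: () => void },
): void {
  if (isInputTarget(event.target)) return;
  if (event.key === 'Delete' || event.key === 'Backspace') {
    event.preventDefault();
    actions.onDeleteSelected();
  }
}
```

A test can then assert `event.preventDefaultCalls === 1` for a 'DIV' target and `0` for an 'INPUT' target, next to the existing check on the delete action.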

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: efb2655b-869a-4c6e-8bb6-ecf53072717c

📥 Commits

Reviewing files that changed from the base of the PR and between b9c75dc and fc67740.

📒 Files selected for processing (7)
  • packages/docs-web/src/content/docs/adapters/web.md
  • packages/web/package.json
  • packages/web/src/components/workflows/NodeInspector.tsx
  • packages/web/src/components/workflows/WorkflowBuilder.tsx
  • packages/web/src/components/workflows/WorkflowCanvas.tsx
  • packages/web/src/hooks/useBuilderKeyboard.test.ts
  • packages/web/src/hooks/useBuilderKeyboard.ts
✅ Files skipped from review due to trivial changes (2)
  • packages/web/package.json
  • packages/docs-web/src/content/docs/adapters/web.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/web/src/hooks/useBuilderKeyboard.ts

@Wirasm Wirasm merged commit d7f36b2 into coleam00:dev Apr 22, 2026
4 checks passed
prospapledge88 added a commit to prospapledge88/Archon that referenced this pull request May 5, 2026
* fix(workflows): fail loudly on SDK isError results (coleam00#1208) (coleam00#1291)

Previously, `dag-executor` only failed nodes/iterations when the SDK
returned an `error_max_budget_usd` result. Every other `isError: true`
subtype — including `error_during_execution` — was silently `break`ed
out of the stream with whatever partial output had accumulated, letting
failed runs masquerade as successful ones with empty output.

This is the most likely explanation for the "5-second crash" symptom in
coleam00#1208: iterations finish instantly with empty text, the loop keeps
going, and only the `claude.result_is_error` log tips the user off.

Changes:
- Capture the SDK's `errors: string[]` detail on result messages
  (previously discarded) and surface it through `MessageChunk.errors`.
- Log `errors`, `stopReason` alongside `errorSubtype` in
  `claude.result_is_error` so users can see what actually failed.
- Throw from both the general node path and the loop iteration path
  on any `isError: true` result, including the subtype and SDK errors
  detail in the thrown message.

Note: this does not implement auto-retry. See PR comments on coleam00#1121 and
the analysis on coleam00#1208 — a retry-with-fresh-session approach for loop
iterations is not obviously correct until we see what
`error_during_execution` actually carries in the reporter's env.
This change is the observability + fail-loud step that has to come
first so that signal is no longer silent.
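
The fail-loud behavior described above might look roughly like the following. This is a sketch under assumptions: `ResultChunk` is a reduced stand-in for the real `MessageChunk`, and `assertResultOk` and its message wording are hypothetical names, not code from dag-executor.

```typescript
// Reduced stand-in for the result message; only the fields named in the
// commit message are modeled here.
interface ResultChunk {
  isError: boolean;
  errorSubtype?: string;
  errors?: string[];
}

// Throw on any isError result, carrying the subtype and SDK error detail,
// instead of breaking out of the stream with partial output.
function assertResultOk(chunk: ResultChunk, nodeId: string): void {
  if (!chunk.isError) return;
  const subtype = chunk.errorSubtype ?? 'unknown';
  const detail =
    chunk.errors && chunk.errors.length > 0 ? `: ${chunk.errors.join('; ')}` : '';
  throw new Error(`Node '${nodeId}' failed with SDK error (${subtype})${detail}`);
}
```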

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
(cherry picked from commit 4c6ddd9)

* fix(db): throw on corrupt commands JSON instead of silent empty fallback (coleam00#1033)

* fix(db): throw on corrupt commands JSON instead of silent empty fallback (coleam00#967)

getCodebaseCommands() silently returned {} when the commands column
contained corrupt JSON. Callers had no way to distinguish 'no commands'
from 'unreadable data', violating fail-fast principles.

Now throws a descriptive error with the codebase ID and a recovery hint.
The error is still logged for observability before throwing.

Adds two test cases: corrupt JSON throws, valid JSON string parses.
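
A minimal sketch of the throw-instead-of-fallback behavior, assuming the column arrives as a JSON string. The function name `parseCodebaseCommands` and the recovery-hint wording are placeholders, not the text used in getCodebaseCommands().

```typescript
// Parse the stored commands JSON, throwing a descriptive error on corrupt
// data rather than returning {} (which callers could not tell apart from
// "no commands").
function parseCodebaseCommands(raw: string, codebaseId: string): Record<string, unknown> {
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    // Placeholder hint; the real message carries its own recovery guidance.
    throw new Error(
      `Corrupt commands JSON for codebase ${codebaseId}: ${reason}. ` +
        'Re-register the commands for this codebase to recover.',
    );
  }
}
```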

* fix: include parse error in log for better diagnostics

(cherry picked from commit 39a05b7)

* fix(isolation): raise worktree git-operation timeout to 5m (coleam00#1306)

All 15 worktree git-subprocess timeouts in WorktreeProvider were hardcoded
at 30000ms. Repos with heavy post-checkout hooks (lint, dependency install,
submodule init) routinely exceed that budget and fail worktree creation.

Consolidate them onto a single GIT_OPERATION_TIMEOUT_MS constant at 5 min.
Generous enough to cover reported cases while still catching genuine hangs
(credential prompts in non-TTY, stalled fetches).

Chosen over the config-key approach in coleam00#1029 to avoid adding permanent
.archon/config.yaml surface for a problem a raised default solves cleanly.
If 5 min turns out to also be too tight for real-world use, we'll revisit.

Closes coleam00#1119
Supersedes coleam00#1029

Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
(cherry picked from commit cc78071)

* fix(web,server): show real platform connection status in Settings (coleam00#1061)

The Settings page's Platform Connections section hardcoded every platform
except Web to 'Not configured', so users couldn't tell whether their Slack/
Telegram/Discord/GitHub/Gitea/GitLab adapters had actually started.

- Server: /api/health now returns an activePlatforms array populated live
  as each adapter's start() resolves. Passed into registerApiRoutes so the
  reference stays mutable — Telegram starts after the HTTP listener is
  already accepting requests, so a snapshot would miss it.
- Web: SettingsPage.PlatformConnectionsSection now reads activePlatforms
  from /api/health and looks each platform up in a Set. Also adds Gitea
  and GitLab to the list (they already ship as adapters).
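
The Set lookup on the web side can be sketched as below. The function name, the platform identifier strings, and the 'Connected' label are assumptions; only 'Not configured' and the activePlatforms field come from the description above.

```typescript
type ConnectionStatus = 'Connected' | 'Not configured';

// Map each known platform to a status by checking membership in the
// activePlatforms array returned by the health endpoint.
function platformConnectionStatuses(
  knownPlatforms: readonly string[],
  activePlatforms: readonly string[],
): Map<string, ConnectionStatus> {
  const active = new Set(activePlatforms);
  const statuses = new Map<string, ConnectionStatus>();
  for (const platform of knownPlatforms) {
    statuses.set(platform, active.has(platform) ? 'Connected' : 'Not configured');
  }
  return statuses;
}
```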

Closes coleam00#1031

Co-authored-by: Lior Franko <liorfr@dreamgroup.com>
(cherry picked from commit 08de8ee)

* fix: initialize options.hooks before merging YAML node hooks (coleam00#1177)

When a workflow node defines hooks (PreToolUse/PostToolUse) in YAML but
no hooks exist yet on the options object, applyNodeConfig crashes with
"undefined is not an object" because it tries to assign properties on
the undefined options.hooks.

Initialize options.hooks to {} before the merge loop.

Reproduces with: archon workflow run archon-architect (which uses
per-node hooks extensively).
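
The fix amounts to creating the hooks object before writing into it. A minimal sketch, assuming a simplified options shape; `HookMap`, `NodeOptions`, and `mergeNodeHooks` are hypothetical names and the real applyNodeConfig does more than this.

```typescript
type HookMap = Record<string, unknown[]>;

interface NodeOptions {
  hooks?: HookMap;
}

// Merge YAML-defined node hooks into options, initializing options.hooks
// first so the per-event assignment never targets undefined.
function mergeNodeHooks(options: NodeOptions, nodeHooks: HookMap): void {
  const hooks = options.hooks ?? (options.hooks = {});
  for (const [event, matchers] of Object.entries(nodeHooks)) {
    hooks[event] = [...(hooks[event] ?? []), ...matchers];
  }
}
```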

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7ea3214)

* fix: detect completion signal in any XML tag, not just <promise> (coleam00#1126) (coleam00#1184)

* fix: detect completion signal in any XML tag, not just <promise> (coleam00#1126)

Loop nodes with `until:` reported max_iterations_reached when the AI wrapped
the completion signal in XML tags other than `<promise>` (e.g.,
`<COMPLETE>ALL_CLEAN</COMPLETE>`). The three existing regex patterns all missed
this format, causing the loop to exhaust iterations and fail.

Changes:
- Add generic XML-wrapped signal pattern to `detectCompletionSignal`
- Extend `stripCompletionTags` to strip matched XML-wrapped signals from output
- Pass `loop.until` to `stripCompletionTags` call site in dag-executor
- Add unit tests for detection and stripping of XML-wrapped signals
- Add integration test for loop completing on final iteration with XML tags

Fixes coleam00#1126

* fix: address review findings for completion signal detection

- Update detectCompletionSignal JSDoc to document all three detection formats
- Update stripCompletionTags JSDoc to mention the `until` parameter
- Remove superfluous `m` flag from xmlWrappedPattern (no anchors, no effect)
- Document that XML tag names are matched independently (intentional permissiveness)
- Add test: detects signal in mismatched XML tags (permissive behavior)
- Add test: strips both <promise> and XML-tagged signal in same chunk
- Add assertion in DAG integration test that raw XML tags don't appear in sent messages

* simplify: reduce complexity in changed files

* fix: require matching XML tag names in completion-signal detection

Follow-up to the initial broadening in this PR. The first version of the
regex accepted mismatched open/close tags (e.g. `<COMPLETE>X</done>`)
which was a small false-positive surface when the AI interleaves tags
in prose. Tightens both detectCompletionSignal and stripCompletionTags
to capture the tag name and enforce it on the close via \1
backreference. Case-insensitivity on the tag name is preserved.

Test updates:
- Flip the "permissive mismatch" case to assert strict rejection with a
  comment explaining the guard.
- Add a case-insensitive matching case to lock that behavior in.

No behavior change for workflows that use matching tags (the
overwhelming common case) or for <promise>...</promise>. Behavior change
is limited to the narrow "open tag and close tag disagree" case, which
only happens when the AI is confused — in which case we'd rather report
max_iterations_reached and let the author inspect than silently call
the loop complete.
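
The backreference guard can be illustrated with a small standalone matcher. This is a sketch, not the pattern in detectCompletionSignal: `hasXmlWrappedSignal` and `escapeRegExp` are hypothetical names, and the tag-name character class is an assumption.

```typescript
// Escape regex metacharacters so the signal is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// True when the signal appears inside an XML-style tag pair whose closing
// tag repeats the opening tag name. Group 1 captures the name and \1
// enforces it on the close; the 'i' flag keeps the comparison case-insensitive.
function hasXmlWrappedSignal(output: string, signal: string): boolean {
  const pattern = new RegExp(
    `<([A-Za-z][\\w-]*)>\\s*${escapeRegExp(signal)}\\s*</\\1>`,
    'i',
  );
  return pattern.test(output);
}
```

With this shape, `<COMPLETE>ALL_CLEAN</COMPLETE>` matches, while `<COMPLETE>ALL_CLEAN</done>` does not.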

(cherry picked from commit bc25dee)

* fix(web): allow deleting nodes from Workflow Builder (coleam00#971) (coleam00#1113)

* fix(web): allow deleting nodes from Workflow Builder (coleam00#971)

Three independent gaps prevented users from deleting nodes added to the
Workflow Builder canvas: dropped nodes were never auto-selected so
keyboard shortcuts silently no-oped, no right-click context menu
existed, and the Delete Node button was buried in the Advanced tab
(hidden below the viewport for Prompt/Command, completely absent for
Bash since bash nodes have no Advanced tab).

Fixes coleam00#971.

* fix(web): push undo snapshot before adding nodes on canvas

Call onPushSnapshot() before setNodes() in both onDrop and quick-add
handlers so that node additions are captured by undo/redo history.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(web): address PR coleam00#1113 review feedback

- Hold nodes/edges in refs so handleNodeDeleteById and onPushSnapshot
  can't capture stale pre-drop state (fixes undo-stack correctness).
- Clamp context-menu x/y to viewport so right-click near edges stays
  fully on-screen.
- Drop non-conformant role=menu/menuitem from the single-item context
  menu; rely on the native button for accessibility.
- Extend isInputTarget() to cover ARIA combobox/textbox/searchbox so
  Backspace in Radix/shadcn widgets never nukes a node.
- Extract handleBuilderKeydown as a pure function and add tests
  covering the Delete/Backspace + isInputTarget invariant.
- Remove issue-number references from code comments per CLAUDE.md.
- Document the new delete affordances in the Workflow Builder docs.
- Inline context-menu dismissal, rename pointer handler, drop unused
  deps in keyboardActions useMemo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
(cherry picked from commit d7f36b2)

* fix(workflows): make archon-adversarial-dev sed replacement macOS-safe (coleam00#1155)

* fix(workflows): make adversarial init sed portable on macOS

* chore: regenerate bundled-defaults after adversarial-dev sed fix

Sync generated bundle with the new temp-file sed pattern in
archon-adversarial-dev.yaml so check:bundled passes and binary
distributions ship the macOS-safe version.

---------

Co-authored-by: laplace young <yangqk12@whu.edu.cn>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
(cherry picked from commit 817186d)

* fix(deps): override transitive axios to ^1.15.0 for CVE-2025-62718 (coleam00#1330)

axios <1.15.0 can be coerced to bypass NO_PROXY rules via hostname
normalization, enabling SSRF in the right network shape. Archon pulls
axios transitively through @slack/bolt (^1.12.0) and @slack/web-api
(^1.13.5); before this change bun.lock resolved axios@1.13.6 — within
the vulnerable range.

Adding "axios": "^1.15.0" to the root package.json overrides bumps the
transitive resolution to axios@1.15.1 (latest compatible 1.x). Both
Slack range specs accept it without API surface changes — no downstream
code touches axios directly.

Supersedes coleam00#1153. Credits @stefans71 for identifying and reporting the
vulnerability; their PR was stale on the lockfile (0.3.5 → 0.3.6 drift
on dev), so this is a fresh one-line re-do on current dev.

Closes coleam00#1053.

Co-authored-by: Stefans71 <stefans71@users.noreply.github.com>
(cherry picked from commit ae2d936)

* fix(cli): surface stale-workspace registration error instead of fake "not a git repo" (coleam00#1332)

* fix(cli): surface stale-workspace registration error instead of fake "not a git repo"

When workflowRunCommand auto-registers an unregistered repo, a stale
~/.archon/workspaces/<owner>/<repo>/source symlink (pointing to an old
checkout) causes createProjectSourceSymlink() in @archon/paths to throw:

  Source symlink at <linkPath> already points to <existing>, expected <target>

The CLI caught that in a try/catch, logged it at warn level, continued
with `codebase = null`, and then the isolation / resume branches hit
their "codebase missing" fallback and threw the generic:

  Cannot create worktree: not in a git repository.

That message is false — the repo is valid; the Archon workspace entry
is stale. It sends users down the wrong diagnostic path (checking git
config, permissions, etc.) instead of pointing at the workspace dir.

Fix: preserve the registration error on a new `codebaseRegistrationError`
local, and at both fallback sites (resume + worktree-creation) check it
before the generic "not a git repo" branch. When set, throw a truthful:

  Cannot {create worktree,resume}: repository registration failed.
  Error: <original message>
  Hint: Remove the stale workspace entry at <dir> and retry, or
        use --no-worktree to skip isolation.

The hint's exact path comes from a small parser that extracts the
workspace directory from the known "Source symlink at …" format; when
the message shape doesn't match (future error text changes), the parser
returns null and we fall back to a generic "check registration under
<archon-home>/workspaces" hint — safe degradation.

Regression test in workflow.test.ts asserts the new error message and
negatively asserts the old "not in a git repository" string is gone.

Supersedes coleam00#1157 — that PR was draft + CONFLICTING against current dev,
and also mentioned Windows test-compat changes that weren't in the diff
(pruned scope). This is a fresh re-do focused strictly on coleam00#1146.

Closes coleam00#1146.

Co-authored-by: Bortlesboat <Bortlesboat@users.noreply.github.com>

* review: add resume-path test, null-fallback test, update troubleshooting docs

Addresses multi-agent review feedback on this PR:

- Add regression test for the --resume fallback site (the worktree-create
  site was already covered; the resume site had identical wiring but zero
  test coverage).
- Add test for the unrecognized-error-shape branch of
  buildRegistrationFailureError so the generic workspace hint is pinned
  (prevents accidental inversion of the stale-entry vs generic-hint
  ternary).
- Update the troubleshooting page to key on the new
  "Cannot create worktree: repository registration failed." message.
  Users hitting the new error won't find the page under the old heading,
  and the "In the future..." note is obsolete now that the error itself
  contains the cleanup path.
- Trim both new docblocks: keep the load-bearing cross-package error
  string contract in extractStaleWorkspaceEntry, drop narration of what
  the code already shows. Drop the "Before this helper existed..."
  paragraph from buildRegistrationFailureError — that's CHANGELOG
  material. Drop PR-reference suffix from the test section divider.

* review: guard getArchonHome in hint + export parser for direct tests

Two follow-up fixes to the multi-agent review commit (f32f002):

CodeRabbit finding — unguarded getArchonHome() in the fallback hint.
If getArchonHome() ever throws (misconfigured env vars, permission issues
on the resolution path), the registration-failure Error would never get
constructed: we'd throw a secondary home-resolution error that masks the
root cause. Wrap the fallback branch in try/catch — prefer losing the
exact path in the hint over replacing the actionable registration error.
A safe generic hint ("Check your Archon workspace registration and retry")
takes over when getArchonHome() throws. The original error.message is
always embedded verbatim in the re-thrown Error.

S2 — export extractStaleWorkspaceEntry for direct table tests. The parser
is where the cross-package string contract with @archon/paths actually
lives; direct tests against it are cheaper than end-to-end CLI tests and
pin the edge cases:

- POSIX path with forward slashes (typical unix user)
- Windows path with backslashes (verifies Math.max(lastIndexOf('/'), lastIndexOf('\\')))
- Unrelated error message (no prefix) → null
- Prefix matches but delimiter missing → null
- Source path without any separator → null (guards against returning
  empty string, which would produce a nonsense "Remove the stale
  workspace entry at " hint)
- Empty string → null

Six new cases in the test file. The claim of Windows support in the
PR description is now actually verified.
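
The edge cases listed above suggest a parser along these lines. The body is a reconstruction from the commit text rather than the shipped code: the prefix and delimiter strings are taken from the quoted "Source symlink at ..." error format, and the constant names are assumptions.

```typescript
const SYMLINK_PREFIX = 'Source symlink at ';
const SYMLINK_DELIMITER = ' already points to ';

// Extract the workspace directory (the parent of the symlink path) from the
// known error format. Returns null whenever the message shape does not match,
// so callers can fall back to a generic hint.
function extractStaleWorkspaceEntry(message: string): string | null {
  if (!message.startsWith(SYMLINK_PREFIX)) return null;
  const end = message.indexOf(SYMLINK_DELIMITER, SYMLINK_PREFIX.length);
  if (end === -1) return null;
  const linkPath = message.slice(SYMLINK_PREFIX.length, end);
  // Accept both POSIX and Windows separators.
  const lastSep = Math.max(linkPath.lastIndexOf('/'), linkPath.lastIndexOf('\\'));
  // No separator, or a separator at index 0, would yield an empty directory.
  if (lastSep <= 0) return null;
  return linkPath.slice(0, lastSep);
}
```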

* fix(test): make generic-hint assertion path-separator agnostic

Windows test runner (CI) hit:
  Expected to contain: "Check your Archon workspace registration under /home/test/.archon/workspaces"
  Received: "... under \home\test\.archon\workspaces and retry, ..."

path.join normalizes to `\` on Windows and `/` on POSIX. The test hardcoded
forward slashes in the expected substring. Split into two separator-agnostic
asserts: the prefix up to "under", then `/workspaces\b/` regex for the final
path segment. Behavior doesn't change — the hint still gets the full
path.join'd workspaces dir on either platform.

---------

Co-authored-by: Bortlesboat <Bortlesboat@users.noreply.github.com>
(cherry picked from commit 056707d)

* fix(server,web,workflows): web approval gates auto-resume + reject-with-reason dialog (coleam00#1329)

* fix(server,web,workflows): web approval gates auto-resume + reject-with-reason dialog

Fixes three tightly-coupled bugs that made web approval gates unusable:

1. orchestrator-agent did not pass parentConversationId to executeWorkflow
   for any web-dispatched foreground / interactive / resumable run. Without
   that field, findResumableRunByParentConversation (the machinery the CLI
   relies on for resume) couldn't find the paused run from the same
   conversation on a follow-up message, and the approve/reject API handlers
   had no conversation to dispatch back to.

2. POST /api/workflows/runs/:runId/{approve,reject} recorded the decision
   and returned "Send a message to continue the workflow." — the workflow
   never actually resumed. Added tryAutoResumeAfterGate() that mirrors what
   workflowApproveCommand / workflowRejectCommand already do on the CLI:
   look up the parent conversation, dispatch `/workflow run <name>
   <userMessage>` back through dispatchToOrchestrator. Failures are
   non-fatal — the user can still send a manual message as a fallback.

3. The during-streaming cancel-check in dag-executor aborted any streaming
   node whenever the run status left 'running', including the legitimate
   transition to 'paused' that an approval node performs. A concurrent AI
   node in the same DAG layer now tolerates 'paused' and finishes its own
   stream; only truly terminal / unknown states (null, cancelled, failed,
   completed) abort the in-flight stream.

Web UI: ConfirmRunActionDialog gains an optional reasonInput prop (label +
placeholder) that renders a textarea and passes the trimmed value to
onConfirm. WorkflowRunCard (dashboard) and WorkflowProgressCard (chat)
both use it for Reject now — the chat card was still on window.confirm,
which was both inconsistent with the dashboard and couldn't collect a
reason. The trimmed reason threads through to $REJECTION_REASON in the
workflow's on_reject prompt.

Supersedes coleam00#1147. @jonasvanderhaegen surfaced the root cause and shape of
the fix; that PR was 87 commits stale and pre-dated the reject-UX upgrade
(coleam00#1261 area), so this is a fresh re-do on current dev.

Tests:
- packages/server/src/routes/api.workflow-runs.test.ts — 5 new cases:
  approve with parent dispatches; approve without parent returns "Send a
  message"; approve with deleted parent conversation skips safely; reject
  dispatches on-reject flows; reject that cancels (no on_reject) does NOT
  dispatch.
- packages/core/src/orchestrator/orchestrator.test.ts — updated the two
  synthesizedPrompt-dispatch tests for the new executeWorkflow arity.

Closes coleam00#1131.

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>

* fix: address multi-agent review findings for web approval auto-resume

C1 (critical) — cross-adapter misrouting guard
  tryAutoResumeAfterGate now checks parentConv.platform_type === 'web'
  before dispatching. Non-web parents (Slack/Telegram/GitHub/Discord)
  being approved from the dashboard skip auto-resume rather than
  dispatching a Slack thread_ts or Telegram chat_id through the web
  adapter's lock manager.

C2 (critical) — fire-and-forget dispatch replaced with await
  void dispatchToOrchestrator() meant the "Resuming workflow." response
  fired before async work completed, and the outer try/catch couldn't
  observe dispatch failures. Changed to await; response now accurately
  reflects dispatch outcome.

I1 — replaced logPrefix string-template (which produced 3-segment
  api.workflow_*.dispatched event names violating {domain}.{action}_{state})
  with literal event names per action, branched inside the helper.
  Accepts action: 'approve' | 'reject' instead.

I2 — corrected misleading "foreground/interactive" qualifier in the
  approve-endpoint comment; background web dispatches also set
  parent_conversation_id via the pre-created run, so they auto-resume too.

I3 — extracted shouldContinueStreamingForStatus() as a small exported
  policy and added 7 unit tests covering running/paused/null/cancelled/
  failed/completed/unknown. Full-integration coverage of the paused-
  tolerance invariant would require manipulating the 10s
  CANCEL_CHECK_INTERVAL_MS, which is flaky-prone; unit test of the
  policy function captures the same invariant deterministically.

I4 — updated approval-nodes.md and authoring-workflows.md to reflect
  that Web UI approve/reject now auto-resumes (no "send a follow-up
  message" copy), documented the reject-with-reason dialog and
  $REJECTION_REASON flow, and called out the cross-platform caveat.

S1 — rewrote streaming status check as positive shouldContinue safe-list
  via the extracted policy function, matching the inline comment.

S2 — inlined handleReject on the dashboard rather than squeezing
  rejectWorkflowRun through runAction with a closure; keeps runAction
  narrow for the single-arg lifecycle actions.

S5 — new regression test covering the non-web-parent skip path
  (slack-platform parent → dispatch skipped → response falls back to
  "Send a message to continue").

S6 — removed stale reference to runAction in ConfirmRunActionDialog's
  onConfirm JSDoc (no longer accurate now that WorkflowProgressCard
  calls the dialog without runAction).

S7 — fixed misleading "user can resume manually by sending any message"
  docstring (resume is triggered by re-running the workflow command,
  not by an arbitrary message).

Skipped as out-of-scope:
  S3 — cancelWorkflowRun rowCount check (pre-existing defect; separate PR)
  S4 — tightening expect.anything() to UUID regex (deferred)
  S8 — 12-positional-arg executeWorkflow → options-bag refactor
    (tracked follow-up)

bun run validate green locally; 68 tests in api.workflow-runs.test.ts
(up from 67), 173 in dag-executor.test.ts (up from 166).
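
The paused-tolerance policy from I3 and S1 above can be sketched as a positive safe-list. The function name comes from the commit text; the `RunStatus` alias and the exact body are assumptions consistent with the seven cases it lists.

```typescript
// Status as read from the run row; null models a run that no longer exists.
type RunStatus = string | null;

// Keep streaming only for states where the run is still live. 'paused' is
// tolerated because an approval node in the same DAG layer may pause the run
// while a sibling node is mid-stream. Everything else aborts.
function shouldContinueStreamingForStatus(status: RunStatus): boolean {
  return status === 'running' || status === 'paused';
}
```

Writing the check as a safe-list means an unrecognized status aborts by default rather than continuing.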

* review: close I1/I2/I3/I4/I6 — paused tolerance in loop + emitter, resume test, useId

I1 (loop inter-iteration check) — dag-executor.ts:1715
  Used `!== 'running'` in the loop node's between-iteration status check.
  A sibling approval node pausing the run in the same topological layer
  would abort the loop mid-iteration with "Loop node '<id>' stopped at
  iteration N (paused)". Switched to the shared shouldContinueStreamingForStatus
  helper so paused is tolerated — same semantics the streaming check got.
  Extended inline comment explains the sibling-layer concurrency reason.

I2 (skipIfStatusChanged emitter unregister) — dag-executor.ts:2886
  At DAG-finalization writes the helper correctly skipped writing on any
  non-running state (paused included — don't mark a paused run complete),
  but it *also* called getWorkflowEventEmitter().unregisterRun() which
  broke SSE observability for a run that's still live (waiting for user
  approval). Split the two responsibilities: skip the write for all
  non-running states, but only unregister the emitter for terminal states
  (cancelled / deleted / completed / failed). `paused` keeps the emitter
  registered so resume stays visible on the dashboard.

I3 (foreground_resume_detected branch untested) — orchestrator-agent.test.ts
  That branch was modified as part of the original fix (added
  parentConversationId as 11th positional arg) but no existing test
  configured mockFindResumableRunByParentConversation to return non-null.
  A positional mistake (e.g. accidentally swapping issueContext and
  parentConversationId) would silently break auto-resume with no failing
  test. New regression test configures the mock, asserts both the cwd
  comes from the resumable run's working_path AND parentConversationId
  is passed correctly at position 10.

I4 (null-parent log level) — api.ts tryAutoResumeAfterGate
  `getConversationById` returning null is a data-integrity signal (the
  parent conversation was deleted while the run was paused) — worth
  surfacing at info level so operators notice, not hiding at debug.
  Missing platform_conversation_id on an existing row would be an unusual
  DB state and stays at debug. Added `parentDeleted: boolean` to the log
  context so the two cases are distinguishable in observability.

I6 (hardcoded DOM id) — ConfirmRunActionDialog.tsx
  `id="confirm-run-action-reason"` collided when multiple dialog instances
  share the same page (Radix portals mitigate in practice but the code
  was fragile). Switched to React.useId() so each instance gets a unique
  id — htmlFor/id wiring preserved.

S11 (arity-only assertion) — orchestrator-agent.test.ts:1092 area
  The interactive-workflow-on-web test asserted mockExecuteWorkflow was
  called, but nothing about the args. Added a specific assertion that
  position 10 (parentConversationId) equals 'conv-1' (the caller
  conversation id) — pins the wiring that I1/I2 depend on being correct.

Deferred (from review S1-S10, I5, I7):
  - S1 (ExecuteWorkflowOptions bag) — tracked as standalone follow-up;
    12 positional args with 2 adjacent optionals is a real maintenance
    hazard but the refactor deserves its own PR.
  - S7 (WHY comment on non-web else branch) — review text says the branch
    "correctly omits" parentConversationId but the code passes it; the
    combination with the web-parent guard in tryAutoResumeAfterGate is
    intentional. Not adding a justify-what-we-don't-do comment.
  - S2/S3/S4/S5/S8/S9/S10 — pure polish (event-map ternary, platformConvId
    inlining, shared constant for REJECTION_REASON_INPUT, onChange arrow
    shorthand, discriminated union, docblock trim, suffix comment drop)
  - I5 (soften "Resuming workflow." to "— check the dashboard for progress")
    — users clicking from the dashboard are already on the dashboard; the
    current text is accurate (enqueue completed) and concise.
  - I7 (test dispatch-throws path) — covered implicitly by the try/catch
    branch of tryAutoResumeAfterGate returning false; a direct test would
    require mocking handleMessage to throw and would couple to
    dispatchToOrchestrator internals.

bun run validate green; 189 dag-executor tests, 98 orchestrator-agent
tests, 68 api.workflow-runs tests — all the new cases pass.

---------

Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>
(cherry picked from commit d5c1cd9)

* feat(providers): autodetect canonical binary install paths for Claude and Codex (coleam00#1361)

Both binary resolvers previously stopped at env-var + explicit config and
threw a "not found" error when neither was set. Users who followed the
upstream-recommended install flow (Anthropic's `curl install.sh` for
Claude, `npm install -g @openai/codex`) still had to manually set either
`CLAUDE_BIN_PATH` / `CODEX_BIN_PATH` or the corresponding config field
before any workflow could run.

Add a tier-N autodetect step between the explicit config tier and the
install-instructions throw. Purely additive: env and config still win
when set (precedence covered by new tests). On autodetect miss, the same
install-instructions error fires as before.
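
The precedence described above (env, then config, then autodetect, then the install-instructions error) could be sketched as follows. Everything here is illustrative: `resolveBinary`, the `Resolved` type, the 'env' and 'config' source labels, and the injected `exists` callback are assumptions, and the error text is a placeholder.

```typescript
type BinarySource = 'env' | 'config' | 'autodetect';

interface Resolved {
  path: string;
  source: BinarySource;
}

// Resolve a binary path in tiers. The filesystem check is injected so the
// precedence logic can be tested without touching the disk.
function resolveBinary(
  envPath: string | undefined,
  configPath: string | undefined,
  probePaths: readonly string[],
  exists: (path: string) => boolean,
): Resolved {
  if (envPath) return { path: envPath, source: 'env' };
  if (configPath) return { path: configPath, source: 'config' };
  const found = probePaths.find((candidate) => exists(candidate));
  if (found !== undefined) return { path: found, source: 'autodetect' };
  // Placeholder for the install-instructions error.
  throw new Error('Binary not found: set the env var or config field, or install it.');
}
```

Probing happens only after both explicit tiers miss, so an explicit setting always wins over a detected install.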

Claude probe list (verified against docs.claude.com "Uninstall Claude
Code → Native installation" section):
  - $HOME/.local/bin/claude            (mac/linux native installer)
  - $USERPROFILE\.local\bin\claude.exe (Windows native installer)

Codex probe list (verified against openai/codex README; npm global-
install puts the binary at `{npm_prefix}/bin/<name>` on POSIX,
`{npm_prefix}\<name>.cmd` on Windows):
  - $HOME/.npm-global/bin/codex   (user-set `npm config set prefix`)
  - /opt/homebrew/bin/codex       (mac arm64 with homebrew-node)
  - /usr/local/bin/codex          (mac intel / linux system node)
  - %APPDATA%\npm\codex.cmd       (Windows npm global default)
  - $HOME\.npm-global\codex.cmd   (Windows user-set prefix)

Not probed (explicit override still required):
  - Custom npm prefixes — `npm root -g` would need a subprocess per
    resolve, too much surface for a probe helper
  - `brew install --cask codex` — cask layout isn't a PATH binary
  - Manual GitHub Releases extracts — placement is user-determined
  - `~/.bun/bin/codex` — not documented in openai/codex README

Pi provider intentionally has no equivalent change: the Pi SDK is
bundled into the archon binary (no subprocess), so there's no "binary"
to resolve. Pi auth lives at `~/.pi/agent/auth.json` which the SDK
already finds by default, and the PR A shim (`PI_PACKAGE_DIR`) handles
the package-dir case via Pi's own documented escape hatch.

E2E verified: removed both config entries from ~/.archon/config.yaml,
rebuilt compiled binary, ran `archon workflow run archon-assist` and a
Codex workflow. Logs showed `source: 'autodetect'` for both, responses
returned cleanly.

(cherry picked from commit b99cee4)

* fix(providers/test): use os.homedir() instead of $HOME in claude binary autodetect test

The native-installer autodetect test computed its expected path from
process.env.HOME, but the implementation uses node:os homedir(). On
Windows, HOME is typically unset (Windows uses USERPROFILE), so the
test fell back to '/Users/test' while the resolver returned the real
home dir — making the spy's path-equality check fail and breaking CI
on windows-latest.

Mirror the implementation by importing homedir() from node:os and
joining with node:path so the expected path matches the actual
platform-resolved home and separator.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
(cherry picked from commit f9f8775)

* fix(server): contain Discord login failure so it doesn't kill the server (coleam00#1365)

Reported in coleam00#1365: a user running `archon serve` with DISCORD_BOT_TOKEN
set but the "Message Content Intent" toggle disabled in the Discord
Developer Portal saw the entire server crash with `Used disallowed
intents`. Discord rejects the gateway connection (close code 4014) when
a privileged intent is requested without being enabled, and the
unguarded `await discord.start()` propagated the error all the way up,
taking the web UI down with it.

Wrap discord.start() in try/catch — log the failure with an actionable
hint (special-cased for the disallowed-intent error) and continue
running. Other adapters and the web UI come up regardless. The shutdown
handler already uses optional chaining (`discord?.stop()`) so nulling
discord after a failed start is safe.
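
The containment pattern can be sketched as below. This is a simplified stand-in, not the server code: the `Adapter` and `Logger` shapes, the `startOptionalAdapter` helper, the hint wording, and the `discord.start_failed` event name are all hypothetical. Only the `Used disallowed intents` error string and the null-on-failure behaviour come from this message.

```typescript
interface Adapter {
  start(): Promise<void>;
  stop(): void;
}

interface Logger {
  error(fields: Record<string, unknown>, event: string): void;
}

// Start an optional adapter; on failure, log and return null so the caller
// can keep booting and later call `adapter?.stop()` safely.
async function startOptionalAdapter<T extends Adapter>(
  adapter: T,
  log: Logger,
): Promise<T | null> {
  try {
    await adapter.start();
    return adapter;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Special-case the privileged-intent rejection with an actionable hint.
    const hint = message.includes("Used disallowed intents")
      ? "Enable the Message Content Intent in the Discord Developer Portal"
      : "Check the bot token and network connectivity";
    log.error({ error: message, hint }, "discord.start_failed"); // hypothetical event name
    return null;
  }
}
```

A caller would then write something like `const discord = await startOptionalAdapter(new DiscordAdapter(), log)` (with `DiscordAdapter` standing in for the real class) and continue starting the remaining adapters and the web UI whether or not the result is null.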

Other adapters (Telegram, Slack, GitHub, Gitea, GitLab) have the same
unguarded-start pattern but are out of scope for this fix — addressing
them is tracked separately.

Also expanded the Discord setup docs with a caution callout that names
the exact error string and the new log event so users can grep for
both.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
(cherry picked from commit 5957c6e)

---------

Co-authored-by: Cole Medin <cole@dynamous.ai>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Kagura <kagura.chen28@gmail.com>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Shay Elmualem <12733941+norbinsh@users.noreply.github.com>
Co-authored-by: Lior Franko <lior.franko@ironsrc.com>
Co-authored-by: Lior Franko <liorfr@dreamgroup.com>
Co-authored-by: Alex Siri <alexsiri7@gmail.com>
Co-authored-by: Ahmed <44034059+medevs@users.noreply.github.com>
Co-authored-by: CauchYoung <2024302072042@whu.edu.cn>
Co-authored-by: laplace young <yangqk12@whu.edu.cn>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
Co-authored-by: Stefans71 <stefans71@users.noreply.github.com>
Co-authored-by: Bortlesboat <Bortlesboat@users.noreply.github.com>
Co-authored-by: Jonas Vanderhaegen <7755555+jonasvanderhaegen@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

bug(web): Workflow Builder nodes cannot be deleted after drag-and-drop (no auto-select, no context menu, missing delete button for bash nodes)

2 participants