web UI: workflow graph view stuck on "Loading graph..." when workflow lives only in ~/.archon/workflows #1710

@fodurrr

Summary

  • What broke: The web UI's per-run graph view never loads when the workflow YAML is registered only in home scope (~/.archon/workflows/). The run executes correctly, the run page renders, the conversation streams — but the graph stays on a "Loading graph..." spinner indefinitely. Console shows GET /api/workflows/<name> returning 404, including the ?cwd=… variant the UI retries with.
  • When it started (if known): Surfaced after moving from per-repo project-scope workflows to a single shared set of symlinks in ~/.archon/workflows/. Before the move the same workflow rendered fine because a project-scope copy existed under <repo>/.archon/workflows/; after the project-scope copy was removed, the UI graph broke even though the CLI still resolves and runs the workflow without issue.
  • Severity: minor. The impact is visual only and runs still complete, but the graph view is unusable for any deployment that registers workflows only in home scope.

Steps to Reproduce

  1. Install a workflow YAML only in home scope:
    ln -s /abs/path/to/source/workflows/my-workflow.yaml \
          ~/.archon/workflows/my-workflow.yaml
    Make sure no project-scope copy exists under any cwd's .archon/workflows/.
  2. From a working directory with no project-scope override, dispatch it:
    cd ~/some/repo
    archon workflow run my-workflow some-args
    The run starts and progresses through nodes normally.
  3. Open the run in the web UI: http://<host>:3090/workflows/runs/<run-id>.
  4. Observe: header, conversation, SSE stream all render. The graph panel sits on "Loading graph..." indefinitely. Hard refresh (Cmd/Ctrl+Shift+R) does not help.
  5. Open DevTools → Network. The page issues:
    • GET /api/workflows/my-workflow → 404
    • GET /api/workflows/my-workflow?cwd=/Users/<you>/some/repo → 404 (retried a few times)
    • GET /api/workflows (the list endpoint) → 200, and the response includes my-workflow under .workflows[].workflow.name.

So the workflow is listable but not gettable.
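The "listable but not gettable" mismatch can be checked mechanically. The following is a minimal sketch, not part of Archon: `findUngettableWorkflows` and `FetchLike` are hypothetical names, and the only thing taken from the report is the list response shape (`.workflows[].workflow.name`) and the two endpoint paths.

```typescript
// Hypothetical consistency check (not an Archon API): every workflow name
// returned by the list endpoint should also resolve via the singular endpoint.
type FetchLike = (
  url: string,
) => Promise<{ status: number; json(): Promise<unknown> }>;

interface ListResponse {
  workflows: { workflow: { name: string } }[];
}

async function findUngettableWorkflows(
  baseUrl: string,
  fetchFn: FetchLike,
): Promise<string[]> {
  const listRes = await fetchFn(`${baseUrl}/api/workflows`);
  if (listRes.status !== 200) {
    throw new Error(`list endpoint returned ${listRes.status}`);
  }
  const body = (await listRes.json()) as ListResponse;

  // Collect every listed name whose singular lookup does not return 200.
  const missing: string[] = [];
  for (const entry of body.workflows) {
    const name = entry.workflow.name;
    const res = await fetchFn(
      `${baseUrl}/api/workflows/${encodeURIComponent(name)}`,
    );
    if (res.status !== 200) missing.push(name);
  }
  return missing;
}
```

Passing the global `fetch` as `fetchFn` against a running `archon serve` should return an empty array once the two endpoints agree; with the behaviour described here, it would include every home-scope-only workflow.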

Expected vs Actual

  • Expected: GET /api/workflows/<name> resolves the workflow using the same scope-fallback order the CLI executor and the list endpoint use — project scope first, then ~/.archon/workflows/. A workflow that appears in the list endpoint should always be retrievable by name from the singular endpoint.
  • Actual: The singular endpoint only looks in project scope (cwd-relative). It does not fall back to home scope. Adding ?cwd=… doesn't change the result; the param is honoured but the home-scope fallback never runs. Any ?source=… / ?scope=… variant returns the same 404, so callers can't opt into home-scope lookup either.
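The expected fallback order can be sketched as follows. This is an illustration under stated assumptions, not Archon's actual resolver: it assumes a workflow named `<name>` lives in a file called `<name>.yaml` (the real resolver may match on the `name` field inside the YAML instead), and the `opts` parameter and the name guard are hypothetical additions for testability and safety.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

type Scope = "project" | "home";

interface ResolvedWorkflow {
  scope: Scope;
  path: string;
}

interface ResolveOptions {
  home?: string; // defaults to the current user's home directory
  exists?: (path: string) => boolean; // injectable for tests
}

// Sketch of a [project, home] lookup. Assumption: <name>.yaml is the file name.
function resolveWorkflowByName(
  name: string,
  cwd?: string,
  opts: ResolveOptions = {},
): ResolvedWorkflow | null {
  const home = opts.home ?? homedir();
  // existsSync follows symlinks, so a symlinked home-scope file is found
  // as long as its target exists.
  const exists = opts.exists ?? existsSync;

  // Hypothetical guard: refuse names that could escape the workflows directory.
  if (name.includes("/") || name.includes("\\") || name.includes("..")) {
    return null;
  }

  // Priority order: project scope (only when a cwd is known), then home scope.
  const dirs: { scope: Scope; dir: string }[] = [];
  if (cwd) dirs.push({ scope: "project", dir: join(cwd, ".archon", "workflows") });
  dirs.push({ scope: "home", dir: join(home, ".archon", "workflows") });

  for (const { scope, dir } of dirs) {
    const candidate = join(dir, `${name}.yaml`);
    if (exists(candidate)) return { scope, path: candidate };
  }
  return null;
}
```

With a lookup of this shape, a request without `?cwd=…` still reaches home scope, and a project-scope copy continues to take precedence when one exists.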

User Flow

Operator                 archon serve (API)               ~/.archon/workflows/
────────                 ───────────────────              ────────────────────
                                                          my-workflow.yaml (symlink)

archon workflow run ───▶ CLI resolver: project? no.
  my-workflow             home?    yes ─────────────────▶ found ✓
                          executes run, returns run-id

opens /workflows/runs ──▶ GET /api/workflows/runs/<id>      200 ✓
                          GET /api/workflows               200 ✓ (list has my-workflow)
                          GET /api/workflows/my-workflow
                            ↳ project scope? no.
                            ↳ [X] no fallback to ~/.archon
                            ↳ 404

UI receives 404 ◀──────── 404 stays 404 even with ?cwd=…
graph spinner spins
forever

Environment

  • Platform: Web (graph view) + CLI (dispatch). Bug is in the Web API.
  • Database: SQLite (default Archon DB on the host running archon serve).
  • Running in worktree? Yes — archon workflow run creates its own worktree as usual; bug reproduces regardless of --no-worktree.
  • OS: macOS (Darwin), archon serve on the same host. Reproduced over a remote network and on localhost directly.

Logs

Browser console on the run page:

GET http://<host>:3090/api/workflows/my-workflow 404 (Not Found)        [×3]
GET http://<host>:3090/api/workflows/my-workflow?cwd=/Users/<you>/… 404 [×3]
[SSE] Connection error, reconnecting... [object Object]

curl from the same host:

$ curl -s http://localhost:3090/api/workflows | jq -r '.workflows[].workflow.name' | grep my-workflow
my-workflow                                  # listable ✓

$ for q in "" "?source=user" "?source=home" "?scope=user" "?scope=home"; do
    curl -s -o /dev/null -w "$q → %{http_code}\n" \
      "http://localhost:3090/api/workflows/my-workflow$q"
  done
 → 404
?source=user → 404
?source=home → 404
?scope=user → 404
?scope=home → 404                            # not gettable ✗

$ ls -la ~/.archon/workflows/my-workflow.yaml
lrwxr-xr-x  …  my-workflow.yaml -> /abs/path/to/source/workflows/my-workflow.yaml

CLI side (succeeds, for contrast):

$ cd ~/some/repo && archon workflow run my-workflow some-args
[node-a] Started
[node-a] Completed (5ms)
[node-b] Started
[node-b] Completed (3.6s)
…workflow proceeds through all nodes normally…

Impact

  • Affected workflows/commands: any workflow that lives only in ~/.archon/workflows/. Any deployment that installs workflows once at home scope rather than copying into every consumer repo loses the graph view for those workflows.
  • Reproduction rate: Always.
  • Workaround available? Two, both unattractive:
    1. Re-introduce a project-scope copy (<repo>/.archon/workflows/<name>.yaml) in every consumer repo. This negates the reason to use home-scope in the first place and means every consumer repo carries a copy of execution assets it doesn't logically own.
    2. Stop using the graph view and rely on CLI logs / SSE stream for run visibility.
  • Data loss risk? No. Runs execute normally; only visualisation is affected.

Scope

  • Package(s) likely involved: server, web, paths.
  • Module (if known): the handler for GET /api/workflows/:name in archon serve — likely short-circuits on a project-scope miss instead of running the same resolver used by GET /api/workflows (list) and by the CLI executor. A minimal fix is to swap the singular handler's lookup for the shared resolveWorkflowByName(name, cwd) (or equivalent) that already walks [project, home] in order. A secondary mitigation would be to teach the singular endpoint to honour an explicit ?scope=home query parameter so the UI can retry on a project-scope miss.
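Both suggestions above can be combined in one handler shape: fall back through [project, home] by default, and let an explicit `?scope=…` narrow the search. The sketch below is framework-agnostic and hypothetical throughout; `handleGetWorkflow`, `scopesForRequest`, and the injected `lookup` callback are illustrative names, and the response body shape is an assumption rather than the server's real contract.

```typescript
type Scope = "project" | "home";

// Injected lookup so the sketch stays independent of the real resolver.
type Lookup = (
  name: string,
  scope: Scope,
  cwd?: string,
) => Record<string, unknown> | undefined;

// An explicit ?scope= narrows the search; anything else uses the full order.
function scopesForRequest(scopeParam: string | null): Scope[] {
  if (scopeParam === "home") return ["home"];
  if (scopeParam === "project") return ["project"];
  return ["project", "home"];
}

function handleGetWorkflow(
  name: string,
  query: URLSearchParams,
  lookup: Lookup,
): { status: number; body: Record<string, unknown> } {
  const cwd = query.get("cwd") ?? undefined;
  for (const scope of scopesForRequest(query.get("scope"))) {
    const workflow = lookup(name, scope, cwd);
    if (workflow !== undefined) {
      return { status: 200, body: { workflow, scope } };
    }
  }
  // Only report 404 after every requested scope has been tried.
  return { status: 404, body: { error: `workflow not found: ${name}` } };
}
```

Returning the matched scope in the body is a design choice rather than a requirement; it would let the UI show where a workflow was resolved from without a second request.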
