You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Clicking into a workflow run's logs from the dashboard / command center sometimes hangs forever on a Loading graph... spinner. The dashboard list works fine; the failure is on the detail page when the graph view tab is selected. No error is shown — the spinner just spins indefinitely until the page is reloaded.
Root Cause
packages/web/src/components/workflows/WorkflowExecution.tsx fetches the workflow definition (DAG topology) in a separate useQuery from the run data, and there is no error state for the graph panel when that fetch fails or returns nothing:
// ~line 241-246const{data: workflowDef}=useQuery({queryKey: ['workflowDefinition',workflowName,codebaseCwd],queryFn: ()=>getWorkflow(workflowName,codebaseCwd),staleTime: Infinity,// once errored, never retries});// ~line 247constdagDefinitionNodes=workflowDef?.workflow?.nodes??null;// ~line 502-516{dagDefinitionNodes ? (<WorkflowDagViewer.../>
) : (
<div>Loading graph...</div> // ← hangs here forever
)}
When that fetch fails (404, transient error, missing data), dagDefinitionNodes stays null, the ternary picks the "Loading graph..." branch, and there is no error indicator, no retry button, and no fallback to the runtime DAG nodes already available in initialData?.dagNodes from the events stream.
Likely Triggers (ranked)
Workflow definition endpoint can 404 for globally-installed workflows.GET /api/workflows/:name at packages/server/src/routes/api.ts:2164-2237 searches project .archon/workflows/ then bundled defaults — but does not check ~/.archon/.archon/workflows/. The LIST endpoint includes the home scope (via discoverWorkflowsWithConfig), so the run appears in the dashboard, but clicking through to the detail page returns 404 → graph view hangs. Closely related to the marketplace install gap I'm filing separately.
Polling crash with unguarded optional chain.refetchInterval callback at WorkflowExecution.tsx:162-166 accesses query.state.data?.workflowState.status (missing ?. before workflowState). A partial response missing workflowState would throw inside the callback. Same area as existing issue Workflow execution page can crash during polling #1598; PR fix(workflows): guard workflow execution polling state #1599 addresses the crash path but not the silent hang.
React Query staleTime: Infinity makes failures sticky. Even if the definition endpoint recovers (network blip), the query never retries. The page stays in the "Loading..." state until a full reload.
Proposed Fix
Render an error state in the graph panel when the workflowDef query is in error status — show the error message and a retry button.
Fall back to initialData?.dagNodes (runtime DAG from the events stream) when the definition fetch fails — runtime data is usually sufficient to draw the graph for an in-progress or completed run.
Add a home-scoped lookup (~/.archon/.archon/workflows/) to GET /api/workflows/:name in packages/server/src/routes/api.ts:2164-2237 so the detail endpoint matches the LIST endpoint's resolution rules.
Remove staleTime: Infinity (or pair it with explicit invalidation on error) so transient failures retry.
Repro Hint
If hard to repro in dev, try running a workflow whose YAML lives at ~/.archon/.archon/workflows/ (i.e., a globally-installed workflow). The run will appear in the dashboard but clicking through to the detail page will hang the graph view.
Summary
Clicking into a workflow run's logs from the dashboard / command center sometimes hangs forever on a
Loading graph...spinner. The dashboard list works fine; the failure is on the detail page when the graph view tab is selected. No error is shown — the spinner just spins indefinitely until the page is reloaded.Root Cause
packages/web/src/components/workflows/WorkflowExecution.tsxfetches the workflow definition (DAG topology) in a separateuseQueryfrom the run data, and there is no error state for the graph panel when that fetch fails or returns nothing:When that fetch fails (404, transient error, missing data),
dagDefinitionNodesstaysnull, the ternary picks the "Loading graph..." branch, and there is no error indicator, no retry button, and no fallback to the runtime DAG nodes already available ininitialData?.dagNodesfrom the events stream.Likely Triggers (ranked)
Workflow definition endpoint can 404 for globally-installed workflows.
GET /api/workflows/:nameatpackages/server/src/routes/api.ts:2164-2237searches project.archon/workflows/then bundled defaults — but does not check~/.archon/.archon/workflows/. The LIST endpoint includes the home scope (viadiscoverWorkflowsWithConfig), so the run appears in the dashboard, but clicking through to the detail page returns 404 → graph view hangs. Closely related to the marketplace install gap I'm filing separately.Polling crash with unguarded optional chain.
refetchIntervalcallback atWorkflowExecution.tsx:162-166accessesquery.state.data?.workflowState.status(missing?.beforeworkflowState). A partial response missingworkflowStatewould throw inside the callback. Same area as existing issue Workflow execution page can crash during polling #1598; PR fix(workflows): guard workflow execution polling state #1599 addresses the crash path but not the silent hang.React Query
staleTime: Infinitymakes failures sticky. Even if the definition endpoint recovers (network blip), the query never retries. The page stays in the "Loading..." state until a full reload.Proposed Fix
workflowDefquery is inerrorstatus — show the error message and a retry button.initialData?.dagNodes(runtime DAG from the events stream) when the definition fetch fails — runtime data is usually sufficient to draw the graph for an in-progress or completed run.~/.archon/.archon/workflows/) toGET /api/workflows/:nameinpackages/server/src/routes/api.ts:2164-2237so the detail endpoint matches the LIST endpoint's resolution rules.staleTime: Infinity(or pair it with explicit invalidation on error) so transient failures retry.Repro Hint
If hard to repro in dev, try running a workflow whose YAML lives at
~/.archon/.archon/workflows/(i.e., a globally-installed workflow). The run will appear in the dashboard but clicking through to the detail page will hang the graph view.File Citations
packages/web/src/components/workflows/WorkflowExecution.tsxworkflowDefquery fails — renders "Loading graph..." foreverpackages/web/src/components/workflows/WorkflowExecution.tsx?.inrefetchIntervalcallbackpackages/server/src/routes/api.tsGET /api/workflows/:namemissing home-scoped lookupRelated