bug: concurrent-run guard has TOCTOU race with pre-created pending rows #1036

@Wirasm

Summary

The concurrent-run guard in executor.ts can be bypassed when two workflow runs are launched nearly simultaneously on the same cwd. Pre-created rows have status='pending' at the time the guard checks for 'running'/'paused', so both runs pass the guard.

Root Cause

The sequence:

Run A:  orchestrator pre-creates row (status='pending')
Run B:  orchestrator pre-creates row (status='pending')
Run A:  executeWorkflow → getActiveWorkflowRunByPath → sees no 'running' row → passes guard
Run B:  executeWorkflow → getActiveWorkflowRunByPath → sees no 'running' row → passes guard
Run A:  updateWorkflowRun(status='running')
Run B:  updateWorkflowRun(status='running')
-- Both runs now executing concurrently on the same cwd --
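
The interleaving above can be reproduced with a small in-memory model. This is a sketch only: `RunRow`, `findActiveRun`, and `simulateRace` are hypothetical names for illustration, not the project's code, and an array stands in for the database table.

```typescript
// Simplified in-memory stand-in for the workflow-runs table (hypothetical).
type Status = 'pending' | 'running' | 'paused';

interface RunRow {
  id: string;
  workingPath: string;
  status: Status;
}

// Mirrors the guard's filter: only 'running' and 'paused' rows count as active.
function findActiveRun(rows: RunRow[], workingPath: string): RunRow | undefined {
  return rows.find(
    (r) => r.workingPath === workingPath && (r.status === 'running' || r.status === 'paused'),
  );
}

function simulateRace(): { aPassed: boolean; bPassed: boolean; runningCount: number } {
  const cwd = '/repo';

  // The orchestrator pre-creates both rows as 'pending'.
  const runA: RunRow = { id: 'A', workingPath: cwd, status: 'pending' };
  const runB: RunRow = { id: 'B', workingPath: cwd, status: 'pending' };
  const rows: RunRow[] = [runA, runB];

  // Both guard checks happen before either status update.
  const aPassed = findActiveRun(rows, cwd) === undefined;
  const bPassed = findActiveRun(rows, cwd) === undefined;

  // Each run that passed the guard transitions to 'running'.
  if (aPassed) runA.status = 'running';
  if (bPassed) runB.status = 'running';

  const runningCount = rows.filter((r) => r.status === 'running').length;
  return { aPassed, bPassed, runningCount };
}

console.log(simulateRace());
```

In this model both guard checks pass and two rows end up 'running' on the same path, which matches the sequence described above.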

The guard at packages/workflows/src/executor.ts:324 queries:

SELECT * FROM remote_agent_workflow_runs
WHERE working_path = $1 AND status IN ('running', 'paused')
ORDER BY started_at DESC LIMIT 1

But the row is inserted with status='pending' at packages/core/src/orchestrator/orchestrator.ts:346, and only transitions to 'running' later at executor.ts:535.

Impact

  • Currently mitigated by worktree isolation — each run gets a unique cwd, so the guard condition (same working_path) is never hit in practice.
  • Would manifest if two workflows are run on the same repo with --no-worktree, or if worktree creation fails and falls back to the source directory.
  • When two runs share a cwd, bash/script nodes writing to relative paths can cross-contaminate.

Possible Fixes

  1. Include 'pending' in the guard query's status filter
  2. Set status to 'running' at pre-creation time instead of 'pending'
  3. Use a database-level advisory lock or INSERT ... ON CONFLICT to serialize the guard check + status update atomically
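
As a rough illustration of option 3, the sketch below folds the guard check and the status update into one uninterruptible step. It is an in-memory model under stated assumptions: `tryClaim` and `simulateAtomicClaim` are hypothetical names, and against a real database the atomicity would have to come from a lock, transaction, or constraint rather than from a synchronous function.

```typescript
// Simplified in-memory stand-in for the workflow-runs table (hypothetical).
type Status = 'pending' | 'running' | 'paused';

interface RunRow {
  id: string;
  workingPath: string;
  status: Status;
}

// Check-and-claim as a single step: nothing else can run between the
// conflict check and the transition to 'running'.
function tryClaim(rows: RunRow[], runId: string): boolean {
  const self = rows.find((r) => r.id === runId);
  if (self === undefined || self.status !== 'pending') return false;

  const path = self.workingPath;
  const conflict = rows.some(
    (r) =>
      r.id !== runId &&
      r.workingPath === path &&
      (r.status === 'running' || r.status === 'paused'),
  );
  if (conflict) return false;

  self.status = 'running';
  return true;
}

function simulateAtomicClaim(): { aClaimed: boolean; bClaimed: boolean; runningCount: number } {
  // Same starting point as the race: both rows pre-created as 'pending'.
  const rows: RunRow[] = [
    { id: 'A', workingPath: '/repo', status: 'pending' },
    { id: 'B', workingPath: '/repo', status: 'pending' },
  ];

  const aClaimed = tryClaim(rows, 'A');
  const bClaimed = tryClaim(rows, 'B'); // sees A already 'running' and is rejected

  const runningCount = rows.filter((r) => r.status === 'running').length;
  return { aClaimed, bClaimed, runningCount };
}

console.log(simulateAtomicClaim());
```

Here only the first claim succeeds and the second run stays 'pending'. One thing the model suggests about option 1: if the guard simply adds 'pending' to its filter, it would presumably also need to exclude the run's own pre-created row, otherwise every run would match itself.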

Files

  • packages/workflows/src/executor.ts:324 — guard query
  • packages/workflows/src/executor.ts:535 — status transition to 'running'
  • packages/core/src/orchestrator/orchestrator.ts:346 — pre-creation with 'pending'
  • packages/core/src/db/workflows.ts:190-192 — getActiveWorkflowRunByPath SQL

Discovered During

Investigation of concurrent workflow runs (#995 investigation side-quest). Confirmed no data corruption occurred — worktree isolation prevented the race from manifesting.
