Session file lock leak when user manually aborts agent (non-timeout abort never releases lock) #88600

@williammu

Description

When a user manually stops an agent, the session file lock is never released, causing subsequent turn attempts to fail with SessionWriteLockTimeoutError for 60+ seconds.

Steps to Reproduce

  1. Start any agent conversation
  2. While agent is mid-response (e.g., waiting on LLM API), manually stop the agent
  3. Immediately try to send another message to the same agent

Expected Behavior

The agent should recover and accept new messages immediately after a manual abort.

Actual Behavior

The next turn blocks on the session file lock and fails with SessionWriteLockTimeoutError after about 60 seconds. The lock stays held until the watchdog or a gateway restart frees it.

Root Cause

File: selection-C4e-Qn9W.js (bundled), function abortRun:

const abortRun = (isTimeout = false, reason) => {
    aborted = true;
    // ...
    abortCompaction();
    abortActiveSession();
    // BUG: releaseHeldLockForAbort() only called for timeout abort
    if (isTimeout) sessionLockController.releaseHeldLockForAbort().catch(...)
};

The call to releaseHeldLockForAbort() is guarded by if (isTimeout). On a manual abort, isTimeout is false, so the call is skipped and the lock is never released.
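
The effect of the guard can be reproduced in isolation. The following is a minimal sketch, not the bundled code: createFakeLockController and makeAbortRun are hypothetical stand-ins, and only the method name releaseHeldLockForAbort() and the isTimeout parameter come from the excerpt above.

```javascript
// Hypothetical stand-in for sessionLockController. Only the method name
// releaseHeldLockForAbort() is taken from the issue.
function createFakeLockController() {
  let held = true; // the active run holds the session lock
  return {
    isHeld: () => held,
    async releaseHeldLockForAbort() {
      held = false;
    },
  };
}

// Mirrors the guard quoted above: the release runs only when isTimeout is true.
function makeAbortRun(controller) {
  return (isTimeout = false) => {
    if (isTimeout) controller.releaseHeldLockForAbort().catch(() => {});
  };
}

const timeoutCase = createFakeLockController();
makeAbortRun(timeoutCase)(true); // timeout abort

const manualCase = createFakeLockController();
makeAbortRun(manualCase)(); // manual abort: isTimeout defaults to false

console.log("timeout abort, lock held:", timeoutCase.isHeld()); // false
console.log("manual abort, lock held:", manualCase.isHeld()); // true
```

In this model the only difference between the two runs is the isTimeout argument, which matches the behavior described in the report.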

Failure Chain

  1. User manual abort → abortRun(isTimeout=false) → releaseHeldLockForAbort() SKIPPED → lock remains held
  2. Run cleanup → acquireForCleanup() → acquireCleanupLock() → takeHeldLockAfterRetainedIdle() fails (lock in use)
  3. Falls back to acquireLock() → lock already held → waits 60s → SessionWriteLockTimeoutError
  4. cleanupEmbeddedAttemptResources() never reached → lock is not released by the run's own cleanup
  5. Only freed by watchdog (maxHoldMs, default 300s) or gateway restart
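
The timing of the chain above can be sketched with a simulated clock. This is a simplified model under stated assumptions: the 60s acquire timeout and the 300s maxHoldMs default come from the report, while the polling interval, the lock object, and the acquireLock and attempt helpers shown here are hypothetical and do not reflect the real implementation.

```javascript
// Error name taken from the issue; the class body is a stand-in.
class SessionWriteLockTimeoutError extends Error {}

const ACQUIRE_TIMEOUT_MS = 60_000; // "waits 60s" in the failure chain
const MAX_HOLD_MS = 300_000; // watchdog maxHoldMs default from the issue
const POLL_MS = 1_000; // hypothetical polling interval

// Simulated lock state: heldSince is the simulated time the lock was taken.
// The aborted run took it at t=0 and never released it.
const lock = { heldSince: 0 };

// Try to take the lock starting at startMs on the simulated clock.
// Returns the simulated acquisition time, or throws on timeout.
function acquireLock(startMs) {
  for (let t = startMs; t <= startMs + ACQUIRE_TIMEOUT_MS; t += POLL_MS) {
    // Watchdog: force-release a lock held for MAX_HOLD_MS or longer.
    if (lock.heldSince !== null && t - lock.heldSince >= MAX_HOLD_MS) {
      lock.heldSince = null;
    }
    if (lock.heldSince === null) {
      lock.heldSince = t;
      return t;
    }
  }
  throw new SessionWriteLockTimeoutError(
    "lock still held after " + ACQUIRE_TIMEOUT_MS + "ms"
  );
}

// Run one acquisition attempt and describe the outcome as a string.
function attempt(label, startMs) {
  try {
    return label + ": acquired at " + acquireLock(startMs) / 1000 + "s";
  } catch (err) {
    return label + ": " + err.constructor.name;
  }
}

const results = [
  attempt("cleanup at 10s", 10_000),
  attempt("next turn at 70s", 70_000),
  attempt("retry at 250s", 250_000),
];
console.log(results.join("\n"));
// cleanup at 10s: SessionWriteLockTimeoutError
// next turn at 70s: SessionWriteLockTimeoutError
// retry at 250s: acquired at 300s
```

In this model every attempt that starts before the watchdog window fails after a full timeout, and the first success lands exactly when the 300s hold limit expires.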

Relevant Logs

Suggested Fix

- if (isTimeout) sessionLockController.releaseHeldLockForAbort().catch(...);
+ sessionLockController.releaseHeldLockForAbort().catch((err) => {
+     log.warn("failed to release session lock on abort: runId=" + params.runId + " " + String(err));
+ });
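
The suggested change can be checked against a stub in the same way. This is a sketch only: createStubLockController and makeFixedAbortRun are hypothetical names, the warn callback stands in for log.warn, and the real abortRun does more work than is modeled here.

```javascript
// Hypothetical stub; only releaseHeldLockForAbort() is named in the issue.
function createStubLockController() {
  let held = true; // the active run holds the session lock
  return {
    isHeld: () => held,
    async releaseHeldLockForAbort() {
      held = false;
    },
  };
}

// abortRun with the isTimeout guard removed, as in the suggested fix.
function makeFixedAbortRun(controller, warn) {
  return (isTimeout = false, reason) => {
    controller.releaseHeldLockForAbort().catch((err) => {
      warn("failed to release session lock on abort: " + String(err));
    });
  };
}

const warnings = [];
const afterTimeoutAbort = createStubLockController();
makeFixedAbortRun(afterTimeoutAbort, (m) => warnings.push(m))(true);

const afterManualAbort = createStubLockController();
makeFixedAbortRun(afterManualAbort, (m) => warnings.push(m))(false, "user stop");

console.log(afterTimeoutAbort.isHeld(), afterManualAbort.isHeld()); // false false
```

The sketch assumes that calling releaseHeldLockForAbort() on every abort is safe even when a later cleanup path also releases the lock. The issue does not confirm whether the real controller tolerates a double release.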

Environment

  • OpenClaw version: 2026.5.27
  • Node: v22.22.1
  • OS: Linux (Ubuntu 7.0.0-15-generic)
  • Provider: DeepSeek (deepseek-v4-pro)

Metadata

Labels

  • P1: High-priority user-facing bug, regression, or broken workflow.
  • clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
  • clawsweeper:queueable-fix: ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
  • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
  • impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
  • impact:session-state: Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
