fix(abort): wire /abort command and sessions.reset/delete to session-run-registry killSessionRun #2343

@alexey-pelykh

Summary

The user-facing /abort and /session abort commands and the gateway sessions.reset / sessions.delete methods all lost their subprocess-termination semantics in the pi-embedded gut (b27cecc795, #76/#77). The pre-gut call sites were natural rewiring points for ChannelBridge: each has a direct equivalent in src/agents/session-run-registry.ts (killSessionRun + waitForSessionRunEnd), but none of them was ever wired to it.

Tracked from the ChannelBridge compatibility audit on #2089.

Regression 1: /abort user command cannot terminate ChannelBridge runs

File: src/auto-reply/reply/abort.ts (REGRESSED via v2026.3.7 sync — imports abortEmbeddedPiRun at line 3)

Current behavior: When a user sends an abort-request message (e.g. /abort), tryFastAbortFromMessage at line 279 resolves the target session, attempts to cancel it via acpManager.cancelSession() (line 323) for ACP sessions, then calls abortEmbeddedPiRun(sessionId) (line 335) — which is a no-op stub. The return value { handled: true, aborted: false, ... } is propagated to the user with the aborted flag always false for non-ACP (channel-bridge) runs. The session gets flagged as abortedLastRun = true in the store, and queues are cleared, but the CLI subprocess keeps running.

Pre-gut behavior (abort.ts:231-234 in pre-gut diff): abortEmbeddedPiRun(sessionId) actually terminated the Pi run, and stopSubagentsForRequester (line 231) also called it once per subagent.

Expected fix:

// Replace line 3 import
import { killSessionRun, waitForSessionRunEnd } from "../../agents/session-run-registry.js";

// Line 335 (inside tryFastAbortFromMessage):
const aborted = sessionId ? killSessionRun(key ?? targetKey) : false;
// Optionally await settlement:
if (aborted) {
  await waitForSessionRunEnd(key ?? targetKey, 5_000);
}

// Also fix stopSubagentsForRequester (pre-gut line 231).
// This lives in a separate function, so `aborted` is not redeclared in the same scope:
const aborted = killSessionRun(childKey);

Note the key shape change: session-run-registry keys on sessionKey (the canonical session identifier used by ChannelBridge in channel-bridge.ts:160), not sessionId (the Pi-era UUID stored in SessionEntry.sessionId).
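
To make the key-shape pitfall concrete, here is a minimal sketch of a registry keyed by sessionKey. It is a hypothetical stand-in, not the real session-run-registry: the Map, registerSessionRun, the SessionRun type, and both example identifiers are assumptions for illustration. Only the killSessionRun name and its boolean result are taken from the fix above.

```typescript
// Hypothetical stand-in for session-run-registry; internals are assumed, not taken from the repo.
type SessionRun = {
  sessionKey: string; // canonical key the registry indexes on
  sessionId: string; // Pi-era UUID kept in SessionEntry.sessionId
  kill: () => void; // terminates the CLI subprocess
};

const activeRuns = new Map<string, SessionRun>();

function registerSessionRun(run: SessionRun): void {
  activeRuns.set(run.sessionKey, run);
}

// Returns true only when a run was registered under this exact key.
function killSessionRun(sessionKey: string): boolean {
  const run = activeRuns.get(sessionKey);
  if (!run) {
    return false;
  }
  run.kill();
  activeRuns.delete(sessionKey);
  return true;
}

let killCount = 0;
registerSessionRun({
  sessionKey: "example-session-key", // illustrative value only
  sessionId: "00000000-0000-4000-8000-000000000000", // illustrative value only
  kill: () => {
    killCount += 1;
  },
});

// Passing the old sessionId misses the run without raising any error ...
const killedById = killSessionRun("00000000-0000-4000-8000-000000000000");
// ... while the sessionKey finds and terminates it.
const killedByKey = killSessionRun("example-session-key");

console.log({ killedById, killedByKey, killCount });
```

Under these assumptions, a call site that still passes sessionId fails quietly (a false return, no exception), which would be indistinguishable from the current always-false aborted flag.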

Regression 2: sessions.reset / sessions.delete don't wait for in-flight subprocesses

File: src/gateway/server-methods/sessions.ts (REGRESSED via v2026.3.7 sync — imports abortEmbeddedPiRun, waitForEmbeddedPiRunEnd at line 6)

Current behavior: ensureSessionRuntimeCleanup at line 187 clears queues, clears bootstrap snapshot, stops subagents, and returns undefined (success) — without waiting for the target session's ChannelBridge CLI subprocess to terminate. This means sessions.reset can succeed and delete session state while a running subprocess is still mid-write to the session file.

Pre-gut behavior: Called abortEmbeddedPiRun(params.sessionId) + await waitForEmbeddedPiRunEnd(params.sessionId, 15_000); if the run didn't end within 15s, returned UNAVAILABLE with message "Session ${params.key} is still active; try again in a moment."

Expected fix:

// Replace line 6 import
import { killSessionRun, waitForSessionRunEnd } from "../../agents/session-run-registry.js";

// Line 187 (ensureSessionRuntimeCleanup):
if (!params.target.canonicalKey) {
  return undefined;
}
killSessionRun(params.target.canonicalKey);
const ended = await waitForSessionRunEnd(params.target.canonicalKey, 15_000);
if (ended) {
  return undefined;
}
return errorShape(
  ErrorCodes.UNAVAILABLE,
  `Session ${params.key} is still active; try again in a moment.`,
);
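
The fix relies on a bounded wait that reports whether the run ended in time. One way such a wait can be built is sketched below. settlesWithin is a hypothetical helper, and the assumption that waitForSessionRunEnd resolves to a boolean is inferred from the `if (ended)` check above rather than from the registry source.

```typescript
// Hypothetical helper: resolves true if `done` settles within timeoutMs, false otherwise.
function settlesWithin(done: Promise<unknown>, timeoutMs: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    const finish = (): void => {
      clearTimeout(timer); // avoid keeping the event loop alive after the run ends
      resolve(true);
    };
    // A run that ends with an error has still ended, so both outcomes count.
    done.then(finish, finish);
  });
}

// Simulated subprocess lifetimes: one exits quickly, one never exits.
const quickExit = new Promise<void>((resolve) => setTimeout(resolve, 10));
const neverExits = new Promise<void>(() => {});

async function demoBoundedWait(): Promise<[boolean, boolean]> {
  const endedInTime = await settlesWithin(quickExit, 200);
  const hungRunEnded = await settlesWithin(neverExits, 50);
  return [endedInTime, hungRunEnded];
}

void demoBoundedWait().then((result) => console.log(result));
```

Resolving to a boolean instead of rejecting on timeout keeps the caller's branch simple: a false result maps directly onto the UNAVAILABLE error path.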

Note: existing tests in src/gateway/server.sessions.gateway-server-sessions-a.test.ts:680,921 are named "sessions.delete rejects main and aborts active runs" and "sessions.reset aborts active runs and clears queues" — they'll need updating to expect killSessionRun / waitForSessionRunEnd calls instead of the Pi-era mocks.
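
As a rough idea of what the updated assertions could check, the sketch below uses a hand-rolled recording fake rather than any particular mocking API, since the test files' mocking setup is not shown here. cleanupRun, RegistryDeps, CleanupError, and makeFakeRegistry are hypothetical names, and the dependency-injected shape is an assumption made to keep the example self-contained; the real ensureSessionRuntimeCleanup imports the registry functions directly.

```typescript
type RegistryDeps = {
  killSessionRun: (sessionKey: string) => boolean;
  waitForSessionRunEnd: (sessionKey: string, timeoutMs: number) => Promise<boolean>;
};

type CleanupError = { code: "UNAVAILABLE"; message: string };

// Hypothetical, dependency-injected version of the kill-and-wait step.
async function cleanupRun(
  deps: RegistryDeps,
  key: string,
  canonicalKey?: string,
): Promise<CleanupError | undefined> {
  if (!canonicalKey) {
    return undefined;
  }
  deps.killSessionRun(canonicalKey);
  const ended = await deps.waitForSessionRunEnd(canonicalKey, 15_000);
  if (ended) {
    return undefined;
  }
  return {
    code: "UNAVAILABLE",
    message: `Session ${key} is still active; try again in a moment.`,
  };
}

// Recording fake: logs each call and lets the test decide whether the run "ends".
function makeFakeRegistry(runEnds: boolean): { calls: string[]; deps: RegistryDeps } {
  const calls: string[] = [];
  const deps: RegistryDeps = {
    killSessionRun: (sessionKey) => {
      calls.push(`kill:${sessionKey}`);
      return true;
    },
    waitForSessionRunEnd: async (sessionKey, timeoutMs) => {
      calls.push(`wait:${sessionKey}:${timeoutMs}`);
      return runEnds;
    },
  };
  return { calls, deps };
}

async function demoCleanup(): Promise<void> {
  const stuck = makeFakeRegistry(false);
  const result = await cleanupRun(stuck.deps, "main", "canonical-main");
  console.log(stuck.calls, result?.code);
}

void demoCleanup();
```

The two properties worth asserting are the call order (kill before wait, both on the canonical key with the 15s timeout) and the UNAVAILABLE result when the wait reports that the run is still active.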

Regression 3: /session abort command (commands-session.ts)

File: src/auto-reply/reply/commands-session.ts (GUTTED — no pi-embedded imports currently)

Current behavior: applyAbortTarget at line ~103 (post-gut) no longer calls abortEmbeddedPiRun(abortTarget.sessionId). It still marks entry.abortedLastRun = true in the session store but doesn't actually terminate the subprocess.

Pre-gut behavior: Called abortEmbeddedPiRun(abortTarget.sessionId) before marking the abort flag.

Expected fix:

import { killSessionRun } from "../../agents/session-run-registry.js";

function applyAbortTarget(params: { ... }) {
  const { abortTarget } = params;
  if (abortTarget.key) {
    killSessionRun(abortTarget.key);
  }
  // ... existing abortedLastRun flag handling
}

Acceptance Criteria

  • abort.ts imports killSessionRun / waitForSessionRunEnd from session-run-registry
  • sessions.ts (server-methods) imports killSessionRun / waitForSessionRunEnd from session-run-registry
  • commands-session.ts applyAbortTarget calls killSessionRun(abortTarget.key)
  • User /abort command actually terminates the running CLI subprocess for non-ACP sessions
  • sessions.reset / sessions.delete wait up to 15s for subprocess termination before returning success, and return an UNAVAILABLE error if the wait times out
  • pnpm check passes
  • Existing abort tests updated (they mock abortEmbeddedPiRun — see src/auto-reply/reply.abort* test files and src/gateway/server.sessions.*.test.ts)
  • Live smoke test: LIVE=1 pnpm test:live -- abort (or manual verification via a running CLI session)

Out of Scope

Related
