Skip to content

fix(agents): register ACP runs in subagent registry for lifecycle tracking#46308

Closed
lanyasheng wants to merge 3 commits into
openclaw:mainfrom
lanyasheng:fix/acp-spawn-register-subagent-run
Closed

fix(agents): register ACP runs in subagent registry for lifecycle tracking#46308
lanyasheng wants to merge 3 commits into
openclaw:mainfrom
lanyasheng:fix/acp-spawn-register-subagent-run

Conversation

@lanyasheng

Copy link
Copy Markdown

Summary

ACP sessions spawned via sessions_spawn(runtime=acp) are never registered in the subagent registry (subagent-registry.ts), making them invisible to the lifecycle event system. This is the root cause of several issues:

  • subagent_ended hooks never fire for ACP sessions
  • Completion notifications never reach the requester
  • Stale session records accumulate in ~/.acpx/sessions/index.json (zombie sessions)
  • maxConcurrentSessions can be exhausted by dead sessions

Changes

Add a registerSubagentRun() call in acp-spawn.ts after successful ACP dispatch, mirroring the existing pattern in subagent-spawn.ts. This enables:

  1. subagent_ended hook fires when ACP sessions complete or exit
  2. Proper cleanup via the existing archive/reaper pipeline
  3. Plugin support -- plugins (e.g. spawn-interceptor) can detect ACP completion through the standard event stream rather than relying on index.json polling

The registration uses cleanup: keep (ACP sessions manage their own workspace) and passes all available context (requesterOrigin, requesterDisplayKey, task, label, spawnMode) for full observability.

Test plan

  • Spawn an ACP session with sessions_spawn(runtime=acp) and verify subagent_ended hook fires upon completion
  • Verify ACP session appears in subagentRuns registry during execution
  • Verify existing subagent (non-ACP) spawn behavior is unchanged
  • Verify spawn-interceptor plugin receives ACP completion events via L1 native event stream

Fixes #40272

…cking

ACP sessions spawned via `sessions_spawn(runtime="acp")` were not registered
in the subagent registry, making them invisible to the lifecycle event system.
This meant `subagent_ended` hooks never fired for ACP sessions, preventing
completion notifications from reaching the requester and leaving stale session
records in `~/.acpx/sessions/index.json`.

Add `registerSubagentRun()` call after successful ACP dispatch, mirroring the
existing pattern in `subagent-spawn.ts`. This enables:

- `subagent_ended` hook to fire when ACP sessions complete or exit
- Proper cleanup of session state via the existing archive/reaper pipeline
- Plugins (e.g. spawn-interceptor) to detect ACP completion through the
  standard event stream rather than relying on index.json polling

Fixes openclaw#40272
@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS labels Mar 14, 2026
@greptile-apps

greptile-apps Bot commented Mar 14, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds a registerSubagentRun() call in acp-spawn.ts after a successful ACP dispatch, mirroring the existing pattern in subagent-spawn.ts. This correctly makes ACP sessions visible to the subagent lifecycle/registry system so that subagent_ended hooks fire, completion notifications reach the requester, and zombie session accumulation is prevented.

  • The approach is correct and well-motivated — the mirrored pattern from subagent-spawn.ts is the right model.
  • The registerSubagentRun call is missing a try/catch guard. In subagent-spawn.ts, a throw from this function is caught and results in a graceful error return. Here, a throw would propagate out of spawnAcpDirect after the ACP session is already live, causing the caller to see a failure while the session is running — recreating orphaned sessions via a different path.
  • resolveMainSessionAlias(cfg) is called a second time to obtain mainKey/alias for the display key. These values are already derived inside resolveRequesterInternalSessionKey at the top of the function; surfacing them to the outer scope would eliminate the duplicate call.

Confidence Score: 3/5

  • Functionally safe to merge with low risk, but the missing try/catch creates a narrow path to post-dispatch exception propagation that should be addressed.
  • The change is well-scoped and the registration parameters are correct. The main concern is the absence of a try/catch around registerSubagentRun: since persistSubagentRunsToDisk already swallows its errors, the practical risk is low, but a thrown exception from loadConfig() or ensureListener() inside the registry call would still propagate out of spawnAcpDirect after a successful dispatch — the exact failure mode the PR is trying to prevent. The existing pattern in subagent-spawn.ts guards against this explicitly.
  • Pay attention to src/agents/acp-spawn.ts around the new registerSubagentRun block (lines 746–757) — specifically the missing error guard.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/acp-spawn.ts
Line: 740-745

Comment:
**Redundant `resolveMainSessionAlias` call**

`resolveMainSessionAlias(cfg)` is called here to obtain `mainKey` and `alias`, but these values are already derived at the very top of `spawnAcpDirect` inside `resolveRequesterInternalSessionKey` (line 413–416), which calls `resolveMainSessionAlias` internally. They are just not surfaced to the outer scope.

Consider either returning `mainKey`/`alias` from `resolveRequesterInternalSessionKey`, or calling `resolveMainSessionAlias(cfg)` once at the top of `spawnAcpDirect` (before calling `resolveRequesterInternalSessionKey`) so the values can be shared — matching the pattern in `subagent-spawn.ts` at line 319.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/acp-spawn.ts
Line: 746-757

Comment:
**Missing try/catch around `registerSubagentRun`**

The registration call is not wrapped in error handling. In `subagent-spawn.ts` (the pattern being mirrored at line 692–735), `registerSubagentRun` is wrapped in a `try/catch` that catches any thrown error and returns `{ status: "error", error: "Failed to register subagent run: ..." }` — preventing an unhandled exception from escaping the spawner function.

Without the same protection here, if `registerSubagentRun` throws (e.g. if `loadConfig()` fails inside it, or `ensureListener()` misbehaves), the exception will propagate out of `spawnAcpDirect` **even though the ACP session was already successfully dispatched** at lines 703–720. The caller would see an error response while the ACP session is actually live — creating exactly the class of orphaned/zombie sessions this PR aims to eliminate, just via a different code path.

Since the dispatch already succeeded, the right behavior on registration failure is to log a warning and continue returning the accepted response — not to throw:

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 5c35c02

Comment thread src/agents/acp-spawn.ts Outdated
Comment on lines +740 to +745
const { mainKey, alias } = resolveMainSessionAlias(cfg);
const requesterDisplayKey = resolveDisplaySessionKey({
key: requesterInternalKey,
alias,
mainKey,
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant resolveMainSessionAlias call

resolveMainSessionAlias(cfg) is called here to obtain mainKey and alias, but these values are already derived at the very top of spawnAcpDirect inside resolveRequesterInternalSessionKey (line 413–416), which calls resolveMainSessionAlias internally. They are just not surfaced to the outer scope.

Consider either returning mainKey/alias from resolveRequesterInternalSessionKey, or calling resolveMainSessionAlias(cfg) once at the top of spawnAcpDirect (before calling resolveRequesterInternalSessionKey) so the values can be shared — matching the pattern in subagent-spawn.ts at line 319.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/acp-spawn.ts
Line: 740-745

Comment:
**Redundant `resolveMainSessionAlias` call**

`resolveMainSessionAlias(cfg)` is called here to obtain `mainKey` and `alias`, but these values are already derived at the very top of `spawnAcpDirect` inside `resolveRequesterInternalSessionKey` (line 413–416), which calls `resolveMainSessionAlias` internally. They are just not surfaced to the outer scope.

Consider either returning `mainKey`/`alias` from `resolveRequesterInternalSessionKey`, or calling `resolveMainSessionAlias(cfg)` once at the top of `spawnAcpDirect` (before calling `resolveRequesterInternalSessionKey`) so the values can be shared — matching the pattern in `subagent-spawn.ts` at line 319.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread src/agents/acp-spawn.ts Outdated
Comment on lines +746 to +757
registerSubagentRun({
runId: childRunId,
childSessionKey: sessionKey,
controllerSessionKey: requesterInternalKey,
requesterSessionKey: requesterInternalKey,
requesterOrigin,
requesterDisplayKey,
task: params.task,
cleanup: "keep",
label: params.label || undefined,
spawnMode,
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing try/catch around registerSubagentRun

The registration call is not wrapped in error handling. In subagent-spawn.ts (the pattern being mirrored at line 692–735), registerSubagentRun is wrapped in a try/catch that catches any thrown error and returns { status: "error", error: "Failed to register subagent run: ..." } — preventing an unhandled exception from escaping the spawner function.

Without the same protection here, if registerSubagentRun throws (e.g. if loadConfig() fails inside it, or ensureListener() misbehaves), the exception will propagate out of spawnAcpDirect even though the ACP session was already successfully dispatched at lines 703–720. The caller would see an error response while the ACP session is actually live — creating exactly the class of orphaned/zombie sessions this PR aims to eliminate, just via a different code path.

Since the dispatch already succeeded, the right behavior on registration failure is to log a warning and continue returning the accepted response — not to throw:

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/acp-spawn.ts
Line: 746-757

Comment:
**Missing try/catch around `registerSubagentRun`**

The registration call is not wrapped in error handling. In `subagent-spawn.ts` (the pattern being mirrored at line 692–735), `registerSubagentRun` is wrapped in a `try/catch` that catches any thrown error and returns `{ status: "error", error: "Failed to register subagent run: ..." }` — preventing an unhandled exception from escaping the spawner function.

Without the same protection here, if `registerSubagentRun` throws (e.g. if `loadConfig()` fails inside it, or `ensureListener()` misbehaves), the exception will propagate out of `spawnAcpDirect` **even though the ACP session was already successfully dispatched** at lines 703–720. The caller would see an error response while the ACP session is actually live — creating exactly the class of orphaned/zombie sessions this PR aims to eliminate, just via a different code path.

Since the dispatch already succeeded, the right behavior on registration failure is to log a warning and continue returning the accepted response — not to throw:

How can I resolve this? If you propose a fix, please make it concise.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5c35c02005

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/acp-spawn.ts Outdated
Comment on lines +746 to +750
registerSubagentRun({
runId: childRunId,
childSessionKey: sessionKey,
controllerSessionKey: requesterInternalKey,
requesterSessionKey: requesterInternalKey,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Tag ACP lifecycle runs with ACP target kind

This new registerSubagentRun(...) path records ACP spawns as generic subagent runs, but the completion hook emitter currently hard-codes targetKind: "subagent" (src/agents/subagent-registry-completion.ts:72). Since ACP events are now flowing through this registry path, plugins/hooks that rely on the supported ACP discriminator (src/agents/subagent-lifecycle-events.ts:1-6) cannot detect ACP completions correctly and will treat them as normal subagent endings.

Useful? React with 👍 / 👎.

…ainSessionAlias

- Wrap registerSubagentRun() in try/catch so registration failure after
  successful dispatch does not propagate as error to caller (the ACP session
  is already live at that point)
- Refactor resolveRequesterInternalSessionKey to return mainKey/alias,
  eliminating a redundant resolveMainSessionAlias(cfg) call
@lanyasheng

Copy link
Copy Markdown
Author

Thanks for the thorough review! Both issues addressed in f89aee6:

  1. try/catch added: registerSubagentRun is now wrapped in a try/catch. On failure, it logs a warning and continues returning the accepted response -- the ACP session is already live, so surfacing an error would create exactly the orphan scenario we are fixing.

  2. Deduplicated resolveMainSessionAlias: Refactored resolveRequesterInternalSessionKey to return { key, mainKey, alias } instead of just the key string, so the caller can reuse mainKey/alias without a second call.

Add two tests to acp-spawn.test.ts:
- Verify registerSubagentRun is called after successful ACP dispatch
  with correct runId, childSessionKey, task, and cleanup parameters.
- Verify spawnAcpDirect still returns success when registerSubagentRun
  throws (the try/catch guard must not propagate registration failures).

@BenliusYang BenliusYang left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks exactly like what we need for the ACP completion relay issue #49782, please prioritize review.

@BenliusYang BenliusYang left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks exactly like what we need for the ACP completion relay issue #49782, please prioritize review.

@vincentkoc

Copy link
Copy Markdown
Member

Closing as duplicate of the ACP run-registration path now consolidated under #53125.

This PR identified the same missing registerSubagentRun() seam in acp-spawn.ts. I am keeping the broader canonical open and folding duplicate discussion there so contributor credit stays attached to the surviving thread.

@vincentkoc vincentkoc closed this Mar 27, 2026
@vincentkoc

Copy link
Copy Markdown
Member

Closing as duplicate of the ACP run-registration path now consolidated under #53125.

This PR identified the same missing registerSubagentRun() seam in acp-spawn.ts. I am keeping the broader canonical open and folding duplicate discussion there so contributor credit stays attached to the surviving thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ACP sub-agent completions don't notify the requester channel (Discord/WhatsApp/etc)

3 participants