fix(gateway): reuse subagent registry snapshot in session listing #75013

Closed
anyech wants to merge 0 commits into openclaw:main from anyech:fix/sessions-list-subagent-registry-cache

@anyech (Contributor) commented Apr 30, 2026

Summary

This PR adds a request-scoped subagent registry read context for Gateway session listing.

It reduces repeated registry snapshot/stat/index/scan work during sessions.list row construction by building one per-request read context and reusing it for:

  • spawnedBy filtering
  • subagent display-row lookup
  • child session resolution
  • active descendant counting

The context is deliberately request-scoped and is not cached globally or persisted.
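As a rough illustration of the idea, the sketch below builds the lookup indexes once from a single snapshot and serves child lookup, controller listing, and active descendant counting from them. The `SubagentRun` shape, the field names, and `createReadContextFromRuns` are hypothetical stand-ins, not the types or helpers in this patch, and the traversal ignores real-world semantics such as restarted runs or moved child sessions.

```typescript
// Hypothetical run shape for illustration; the real registry types are not shown in this thread.
type SubagentRun = {
  runId: string;
  childSessionKey: string;
  controllerSessionKey: string; // the session that spawned this run
  active: boolean;
};

type RegistryReadContext = {
  runByChild(childSessionKey: string): SubagentRun | undefined;
  runsForController(controllerSessionKey: string): SubagentRun[];
  countActiveDescendants(sessionKey: string): number;
};

// Build every index once from one point-in-time list of runs.
function createReadContextFromRuns(runs: readonly SubagentRun[]): RegistryReadContext {
  const byChild = new Map<string, SubagentRun>();
  const byController = new Map<string, SubagentRun[]>();
  for (const run of runs) {
    byChild.set(run.childSessionKey, run);
    const siblings = byController.get(run.controllerSessionKey) ?? [];
    siblings.push(run);
    byController.set(run.controllerSessionKey, siblings);
  }
  return {
    runByChild: (key) => byChild.get(key),
    runsForController: (key) => byController.get(key) ?? [],
    countActiveDescendants(sessionKey) {
      let count = 0;
      const seen = new Set<string>([sessionKey]); // guards against cycles
      const stack = [sessionKey];
      while (stack.length > 0) {
        const current = stack.pop()!;
        for (const run of byController.get(current) ?? []) {
          if (seen.has(run.childSessionKey)) continue;
          seen.add(run.childSessionKey);
          if (run.active) count += 1;
          stack.push(run.childSessionKey);
        }
      }
      return count;
    },
  };
}

// One context, reused for every lookup in the same request.
const ctx = createReadContextFromRuns([
  { runId: "r1", childSessionKey: "child-a", controllerSessionKey: "root", active: true },
  { runId: "r2", childSessionKey: "child-b", controllerSessionKey: "child-a", active: true },
  { runId: "r3", childSessionKey: "child-c", controllerSessionKey: "root", active: false },
]);
```

Because the indexes live only inside the returned object, dropping the context at the end of the request discards the snapshot with it.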

Relationship to #74970 / slice 1

#74970 adds regression coverage for the slice-1 behavior now present on main: safe sessions.list paths filter/sort/limit entries before expensive row enrichment.

This PR is slice 2. Slice 1 avoids enriching rows that will be dropped. Slice 2 reduces repeated subagent-registry snapshot/stat/scan work for rows that still need enrichment, including selected rows and legacy spawnedBy / subagent child-resolution paths.
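To make the slice-1 ordering concrete, here is a minimal sketch that filters, sorts, and limits before enriching, so enrichment only runs for rows that survive. `ListEntry`, `enrichRow`, and `listSessionRows` are hypothetical names; the 240/80 sizes only echo the benchmark shape and are not the benchmark itself.

```typescript
type ListEntry = { key: string; updatedAt: number; spawnedBy?: string };
type ListRow = ListEntry & { activeDescendants: number };

let enrichCalls = 0;

// Stand-in for expensive per-row enrichment (registry lookups and similar work).
function enrichRow(entry: ListEntry): ListRow {
  enrichCalls += 1;
  return { ...entry, activeDescendants: 0 };
}

function listSessionRows(
  entries: readonly ListEntry[],
  opts: { spawnedBy?: string; limit: number },
): ListRow[] {
  return entries
    .filter((e) => opts.spawnedBy === undefined || e.spawnedBy === opts.spawnedBy)
    .sort((a, b) => b.updatedAt - a.updatedAt) // newest first
    .slice(0, opts.limit)
    .map((e) => enrichRow(e)); // enrichment happens last, on surviving rows only
}

const sampleEntries: ListEntry[] = Array.from({ length: 240 }, (_, i) => ({
  key: `s${i}`,
  updatedAt: i,
}));
const selectedRows = listSessionRows(sampleEntries, { limit: 80 });
```

Slice 2 then targets what `enrichRow` stands in for: the per-row registry work that still runs for the selected rows.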

Future slice 3 work, if needed, should stay separate and focus on transcript fallback caching and/or session-store metadata dedupe.

Why

Before this change, session listing could recreate subagent registry read snapshots and scan registry state repeatedly while filtering and enriching rows. This is especially visible in spawnedBy and child-session enrichment paths.

This PR keeps the same request-level behavior but shares one point-in-time registry read context through the relevant call chain.

Validation

All validation used an isolated source checkout and synthetic or copied lab data only. No live Gateway, live config, installed dist, or live session store was modified or benchmarked.

Local checks:

  • OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs run --config test/vitest/vitest.gateway.config.ts src/gateway/session-utils.subagent.test.ts — 24 tests passed
  • OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs run --config test/vitest/vitest.unit-fast.config.ts src/agents/subagent-registry-read-context.test.ts src/agents/subagent-registry-queries.test.ts — 23 tests passed
  • node scripts/run-oxlint.mjs --tsconfig tsconfig.oxlint.core.json src/agents/subagent-registry-read.ts src/agents/subagent-registry-read-context.test.ts src/gateway/session-utils.ts src/gateway/session-utils.subagent.test.ts — passed
  • corepack pnpm tsgo:core:test — passed
  • corepack pnpm build — passed

Synthetic benchmark (240 entries, 80 selected rows):

| case | registry stat calls | registry file reads | elapsed |
| --- | --- | --- | --- |
| before | 641 | 1 | 11676.507 ms |
| after | 1 | 1 | 9519.779 ms |

The primary claim is reduced registry snapshot/stat/index/scan work, not guaranteed cold physical disk-I/O reduction. File contents may already be served by OS/runtime caching depending on environment.
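For reference, the relative changes implied by the reported figures work out as follows. This is plain arithmetic on the numbers above, not an additional measurement.

```latex
\begin{align*}
\text{stat calls:}\quad & 641 \to 1 \quad (640 \text{ fewer}) \\
\Delta t &= 11676.507 - 9519.779 = 2156.728\ \text{ms} \\
\frac{\Delta t}{t_{\text{before}}} &= \frac{2156.728}{11676.507} \approx 0.185 \quad (\text{about } 18.5\%)
\end{align*}
```

Elapsed time drops by a much smaller fraction than the stat count, which fits the framing that the claim is about reduced registry work rather than end-to-end latency.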

Notes / caveats

  • The context is intentionally request-scoped. It represents one point-in-time registry snapshot for one bounded read operation and should not be hoisted into cross-request/global state.
  • If the registry changes during a single sessions.list request, that request sees the snapshot captured when the context was created. This avoids inconsistent repeated snapshots within one request and matches the intended read-boundary tradeoff.
  • Direct one-row loads still create their own one-call context; they do not share contexts across independent calls.
  • Persistent registry format/semantics are unchanged.
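The scoping rules in these notes can be sketched with an optional context parameter: the list path creates one context and passes it down, while a direct one-row load falls back to creating its own. All names here (`ScopedReadContext`, `createScopedReadContext`, `buildRowWithContext`, `listRowsWithSharedContext`) are hypothetical and the snapshot is a stub.

```typescript
type ScopedReadContext = { snapshotId: number; activeRuns: ReadonlySet<string> };

let snapshotsTaken = 0;

// Hypothetical stand-in for capturing one point-in-time registry snapshot.
function createScopedReadContext(): ScopedReadContext {
  snapshotsTaken += 1;
  return { snapshotId: snapshotsTaken, activeRuns: new Set(["child-a"]) };
}

// The default parameter is evaluated only when no context is supplied,
// so direct one-row loads get their own one-call context.
function buildRowWithContext(
  sessionKey: string,
  ctx: ScopedReadContext = createScopedReadContext(),
) {
  return {
    sessionKey,
    hasActiveRun: ctx.activeRuns.has(sessionKey),
    snapshotId: ctx.snapshotId,
  };
}

function listRowsWithSharedContext(keys: readonly string[]) {
  const ctx = createScopedReadContext(); // one snapshot for the whole request
  return keys.map((key) => buildRowWithContext(key, ctx));
}

const listedRows = listRowsWithSharedContext(["child-a", "child-b", "child-c"]);
const snapshotsAfterList = snapshotsTaken;
const singleRow = buildRowWithContext("child-a"); // independent call, independent context
const snapshotsAfterSingle = snapshotsTaken;
```

Nothing is stored at module level except the counter used for the demonstration, which mirrors the "not hoisted into global state" constraint.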

@openclaw-barnacle Bot added the gateway (Gateway runtime), agents (Agent runtime and tooling), and size: L labels Apr 30, 2026
@clawsweeper Bot (Contributor) commented Apr 30, 2026

Codex review: needs maintainer review before merge.

What this changes:

This PR adds a request-scoped subagent registry read context and threads it through Gateway session listing and row construction, with focused agent and Gateway tests for parity and reduced registry reads.

Maintainer follow-up before merge:

This is an open implementation PR with no concrete repair finding from this review; the remaining action is maintainer review plus completed CI, not an automated replacement or fix PR.

Security review:

Security review cleared: The diff only changes TypeScript registry/session-list read paths and tests; it does not add dependencies, scripts, CI permissions, artifact downloads, or secret handling.

Review details

Best possible solution:

Land a request-scoped registry read context if CI remains green and maintainers are comfortable with the snapshot boundary; keep it scoped to one sessions.list/read operation and avoid hoisting it into global cache state.

Do we have a high-confidence way to reproduce the issue?

Yes. The current-main reproduction path is code-level: sessions.list spawnedBy filtering and row enrichment call registry snapshot-backed helpers repeatedly, and the PR adds a focused fs.statSync-based regression test for the repeated disk snapshot case.

Is this the best way to solve the issue?

Yes, likely. A per-request context is the narrowest maintainable optimization because it avoids global caching or persistence changes while preserving the existing direct row fallback behavior.

What I checked:

  • Current main still uses per-call helpers in sessions.list: On current main, listSessionsFromStore and buildGatewaySessionRow call getSessionDisplaySubagentRunByChildSessionKey, countActiveDescendantRuns, and listSubagentRunsForController directly rather than sharing a request-scoped context. (src/gateway/session-utils.ts:1660, 099037cca6f5)
  • Current registry read helpers snapshot on each call: The current read facade wraps getSubagentRunsSnapshotForRead inside each exported query helper, which matches the repeated snapshot/stat work the PR is trying to reduce. (src/agents/subagent-registry-read.ts:23, 9d68c6768ae2)
  • PR introduces request-scoped context: The patch adds SubagentRegistryReadContext plus createSubagentRegistryReadContextFromRuns/createSubagentRegistryReadContext and uses the context for controller listing, display lookup, and active descendant counts. (src/agents/subagent-registry-read.ts:21, 2bf9f982fb8d)
  • PR threads context through Gateway row construction: The patch passes one lazy getSubagentRegistry() result through spawnedBy filtering and buildGatewaySessionRow, while preserving direct one-row loads by creating a context only when no context is supplied. (src/gateway/session-utils.ts:1628, 2bf9f982fb8d)
  • Regression coverage added: The PR adds parity tests for the new read context and Gateway tests that assert one registry stat for a spawnedBy listing and no registry read when raw filters drop all rows. (src/agents/subagent-registry-read-context.test.ts:1, 2bf9f982fb8d)
  • CI status snapshot: GitHub check-runs for the PR head showed 63 successes, 12 skipped, 1 neutral CodeQL result, and 6 in progress at review time; the only cancelled run was auto-response. (2bf9f982fb8d)
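The lazy registry access and the "no registry read when raw filters drop all rows" behavior noted in the checks above can be illustrated with a memoized getter and a read counter. This is a sketch under assumptions: `lazy`, `readRegistryStub`, `RawEntry`, and `listSpawnedBy` are hypothetical, and the counter stands in for the stat-based assertion the review describes rather than reproducing it.

```typescript
// Memoize a factory so the expensive read happens at most once, and only on first use.
function lazy<T>(factory: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: factory() };
    return cached.value;
  };
}

let registryReads = 0;

// Hypothetical stand-in for loading the registry snapshot; maps child key -> controller key.
function readRegistryStub(): Map<string, string> {
  registryReads += 1;
  return new Map([
    ["child-a", "root"],
    ["child-b", "root"],
  ]);
}

type RawEntry = { key: string; label?: string };

function listSpawnedBy(
  entries: readonly RawEntry[],
  opts: { label?: string; spawnedBy: string },
): string[] {
  const getRegistry = lazy(readRegistryStub); // created per request, never module-level
  return entries
    .filter((e) => opts.label === undefined || e.label === opts.label) // cheap raw filter first
    .filter((e) => getRegistry().get(e.key) === opts.spawnedBy) // first call triggers the read
    .map((e) => e.key);
}

const rawEntries: RawEntry[] = [
  { key: "child-a", label: "x" },
  { key: "child-b", label: "x" },
  { key: "other", label: "x" },
];

const droppedAll = listSpawnedBy(rawEntries, { label: "nope", spawnedBy: "root" });
const readsAfterDropped = registryReads;
const keptKeys = listSpawnedBy(rawEntries, { spawnedBy: "root" });
const readsAfterKept = registryReads;
```

When the raw filter removes every entry, the second filter never runs and the getter is never invoked, so the read count stays at zero for that request.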

Likely related people:

  • @steipete: Recent commits touched both Gateway session list behavior and shared subagent registry query helpers, including the slice-1 session-list enrichment optimization and registry query helper refactor. (role: recent maintainer and central feature owner; confidence: high; commits: 9402bca614d0, 3b2db583cdb1, 757aee4cdd5d; files: src/gateway/session-utils.ts, src/agents/subagent-registry-read.ts, src/agents/subagent-registry-queries.ts)
  • @Takhoffman: History for the subagent registry query path includes fixes for restarted descendant counts and moved child sessions, which are semantics this PR explicitly preserves in the indexed context tests. (role: adjacent subagent traversal owner; confidence: medium; commits: c541cde0f66e, e48a0b80a81b; files: src/agents/subagent-registry-queries.ts, src/gateway/session-utils.subagent.test.ts)
  • @ZiPengWei: Recent merged work touched subagent recovery and session state reconciliation around the registry read files affected by this PR. (role: recent subagent registry maintainer; confidence: medium; commits: 845040214e13; files: src/agents/subagent-registry-read.ts, src/agents/subagent-registry-state.ts)

Remaining risk / open question:

  • Some PR check runs were still in progress at review time, so final merge readiness depends on the completed CI set.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 099037cca6f5.

@vincentkoc (Member) commented

Restored the patch in replacement draft PR #75019 after the fork branch was accidentally pushed to the current base during CI repair and GitHub auto-closed this PR. The replacement keeps the same change set and credits this PR as the source thread.

Labels

agents (Agent runtime and tooling), gateway (Gateway runtime), size: XS
