fix: Add runner label to /status #70595
Greptile Summary
This PR adds a …
Confidence Score: 5/5
Safe to merge; only one minor test-clarity P2 concern with no production impact.
All findings are P2 (test config readability). The production logic is correct and well-structured; no runtime bugs identified.
No files require special attention beyond the minor test config clarity note in src/auto-reply/status.test.ts.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/status.test.ts
Line: 120-132
Comment:
**Test config doesn't drive CLI detection**
The `config` passed here has `models.providers["claude-cli"]` set, but `isCliProvider` never looks at `config.models.providers`. It checks `cfg?.agents?.defaults?.cliBackends`, then `resolveRuntimeCliBackends()`, then `resolvePluginSetupCliBackendRuntime()`. If the test is green, CLI detection is happening via one of those runtime paths, not via the config object above. That makes the `config` property misleading: a reader might reasonably believe it is what makes `"claude-cli"` a CLI provider in this test, but it isn't. Consider either removing the spurious `models.providers` key or adding an `agents.defaults.cliBackends: { "claude-cli": {} }` entry that actually exercises the config-driven branch.
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "Add runner label to status output"
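The lookup order described in the comment above can be illustrated with a minimal sketch. Only the `agents.defaults.cliBackends` path and the three-step order come from the review; the `StatusConfig` and `RuntimeLookups` types and the `isCliProviderSketch` name are assumptions made for illustration, and the two runtime resolvers are injected as stand-ins because their real signatures are not shown in this PR.

```typescript
// Hypothetical config shape: only agents.defaults.cliBackends is named in the review.
interface StatusConfig {
  agents?: { defaults?: { cliBackends?: Record<string, unknown> } };
  models?: { providers?: Record<string, unknown> };
}

// Stand-ins for resolveRuntimeCliBackends() and resolvePluginSetupCliBackendRuntime().
// Their real signatures are unknown here, so they are passed in by the caller.
interface RuntimeLookups {
  runtimeBackends: () => string[];
  pluginSetupBackend: (provider: string) => unknown;
}

function isCliProviderSketch(
  provider: string,
  cfg: StatusConfig | undefined,
  rt: RuntimeLookups,
): boolean {
  // 1. Config-driven branch: agents.defaults.cliBackends.
  const configured = cfg?.agents?.defaults?.cliBackends ?? {};
  if (Object.prototype.hasOwnProperty.call(configured, provider)) return true;
  // 2. Runtime-registered CLI backends.
  if (rt.runtimeBackends().indexOf(provider) !== -1) return true;
  // 3. Plugin setup runtime.
  return rt.pluginSetupBackend(provider) !== undefined;
}

// With no runtime registrations, only the config-driven branch can match.
const noRuntime: RuntimeLookups = {
  runtimeBackends: () => [],
  pluginSetupBackend: () => undefined,
};

// models.providers alone does not mark the provider as CLI-backed.
const misleadingConfig: StatusConfig = { models: { providers: { "claude-cli": {} } } };
// agents.defaults.cliBackends exercises the config-driven branch.
const explicitConfig: StatusConfig = {
  agents: { defaults: { cliBackends: { "claude-cli": {} } } },
};
```

Under these assumptions, a test that passes `misleadingConfig` only goes green when a runtime path recognises the provider, which is the ambiguity the review flags; `explicitConfig` makes the config itself the deciding input.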
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5e6ffbabaa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…abel # Conflicts: # CHANGELOG.md
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 52905d76cc
const acpAgentRaw = normalizeOptionalString(args.sessionEntry?.acp?.agent);
const acpAgent = acpAgentRaw ? sanitizeTerminalText(acpAgentRaw) : undefined;
if (acpAgent) {
  const backendRaw = normalizeOptionalString(args.sessionEntry?.acp?.backend);
Preserve ACP runner label when agent is absent
resolveRunnerLabel only treats a session as ACP-backed when sessionEntry.acp.agent sanitizes to a non-empty value, so ACP entries with missing/blank agent metadata fall through to provider detection and can be mislabeled as pi (embedded) or CLI. That case is realistic because ACP runtime code already guards for missing meta.agent by deriving it from the session key (src/acp/control-plane/manager.core.ts:1377). The runner label should stay in the ACP path whenever ACP metadata exists, even if the stored agent string is empty.
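One way to read this suggestion is sketched below: the ACP branch is selected by the presence of ACP metadata, and a blank agent is replaced by a caller-supplied fallback. The `AcpMeta` type, the `acpRunnerLabelSketch` and `normalizeSketch` names, the `fallbackAgent` parameter, and the `(acp)` label used when no backend is stored are all assumptions for illustration; terminal sanitization is omitted, and how the real runtime derives the agent from the session key is not shown in this PR.

```typescript
interface AcpMeta {
  agent?: string;
  backend?: string;
}

// Trim and treat blank strings as missing (a stand-in for normalizeOptionalString).
function normalizeSketch(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Hypothetical helper: stays on the ACP path whenever ACP metadata exists.
// fallbackAgent is supplied by the caller, e.g. derived from the session key.
function acpRunnerLabelSketch(
  acp: AcpMeta | undefined,
  fallbackAgent: string,
): string | undefined {
  // No ACP metadata at all: the caller continues with provider detection.
  if (!acp) return undefined;
  const agent = normalizeSketch(acp.agent) ?? fallbackAgent;
  const backend = normalizeSketch(acp.backend);
  // Assumption: omit the backend suffix when none is stored.
  return backend ? `${agent} (acp/${backend})` : `${agent} (acp)`;
}
```

The design point is that the branch condition is `acp` being present rather than `acp.agent` being non-empty, so a session with blank agent metadata cannot fall through to the CLI or embedded labels.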
* Add runner label to status output
* Add changelog entry for status runner label
* Fix status runner detection and sanitization
(cherry picked from commit 03477cc)
…e trap-class
Three confirmed instances from #325 phase-2 rebase:
- e515ea1 (gateway live test hardening; base has richer probes via f07b00d+a53fea3905+5f702b464b)
- aa1908b (docker live backend; same parallel-evolution pattern)
- 7ee46a3 (PR openclaw#70595 already in base CHANGELOG; base evolved naming + lookup table)
Discovery channels: git cherry (insufficient), CHANGELOG-byte-grep (high precision), conflict-content rubric.
Tracking: #331
Eats dogfood for: openclaw-bootstrap#702
…y primitives)
Wires the §1 entry-point it.todo placeholder
swim-runner.test.ts L80: "rebase bot classifies synthetic squash-rebased
commit as DROP (not PICK)"
onto runnable ground by composing the two §1 discovery primitives
(changelog-grep + cherry-pick-provenance) into a single DROP/PICK/REVIEW
verdict with channel attribution and structured evidence for downstream
span emission.
API:
- classifyRebasePick({subject, commitBody, baseChangelog, isAncestorOf})
returns {verdict: DROP|PICK|REVIEW, channel, evidence,
needsConflictContentInspection?}
Channel ordering (memo-aligned, deterministic):
1. cherry-pick-provenance (highest precision: exact source SHA,
ancestor-of-base check unambiguous)
2. changelog-grep:pr (high precision: PR tokens rarely collide)
3. changelog-grep:subject (medium precision: substring, can collide)
Load-bearing invariant: when both positive-signal channels miss, the
verdict is REVIEW with needsConflictContentInspection=true, NOT a
silent default to PICK. The memo's third channel (conflict-content
classification rubric) needs file-path + diff inspection, not pure
string→struct, so it's NOT covered here; classifier surfaces the gap
explicitly so the caller can hand off rather than misclassify the
test-harness divergence cases.
Evidence completeness: even when one channel is load-bearing for the
verdict, ALL channels' evidence is recorded in the result so the
downstream span emission can carry full audit data. Non-ancestor
cherry-pick footers go to a separate
slot for cross-branch provenance audit.
Boundary: pure function. Caller drives:
- subject from `git log -1 --format='%s' <pick>`
- body from `git show -s --format=%B <pick>`
- base CHANGELOG from `git show <base>:CHANGELOG.md`
- isAncestorOf wrapping `git merge-base --is-ancestor`
Coverage (10 tests):
- Instance 1 (openclaw#70595): DROP via changelog-grep:pr; no
needsConflictContentInspection on DROP
- Instance 3 (cherry-pick): DROP via cherry-pick-provenance;
precedence over CHANGELOG when both fire; non-ancestor footer
doesn't trigger DROP via that channel
- Instance 2 (test-harness): REVIEW with
needsConflictContentInspection=true; classifier never defaults to
PICK on absence-of-signal
- subject-channel: DROP via changelog-grep:subject
- evidence completeness: both channels' evidence recorded when both
fire; non-ancestor footers separated from load-bearing slot
Result: 10/10 passing this file. swim-37 vitest project: 79 passed |
18 todo (6 files). pnpm lint:core: 0/0.
Closes part of #324; third of three §1 todos wired to runnable ground.
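The channel ordering and the REVIEW invariant described in this commit message can be sketched as a pure function. This is an illustration under stated assumptions, not the contents of `rebase-classifier.ts`: the `evidence` field names, the `"none"` channel used on REVIEW, and the inline footer and PR-token regexes are hypothetical.

```typescript
type Verdict = "DROP" | "PICK" | "REVIEW";
type Channel = "cherry-pick-provenance" | "changelog-grep:pr" | "changelog-grep:subject" | "none";

interface ClassifyInput {
  subject: string;
  commitBody: string;
  baseChangelog: string;
  isAncestorOf: (sha: string) => boolean; // caller wraps `git merge-base --is-ancestor`
}

interface ClassifyResult {
  verdict: Verdict;
  channel: Channel;
  // Hypothetical evidence layout: every channel's findings are kept for audit.
  evidence: { ancestorShas: string[]; nonAncestorShas: string[]; prHit: boolean; subjectHit: boolean };
  needsConflictContentInspection?: boolean;
}

function classifyRebasePickSketch(input: ClassifyInput): ClassifyResult {
  // Collect every "(cherry picked from commit <sha>)" footer, in body order.
  const footer = /^\s*\(cherry picked from commit ([0-9a-f]{7,40})\)\.?\s*$/gm;
  const shas: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = footer.exec(input.commitBody)) !== null) shas.push(String(m[1]));
  const ancestorShas = shas.filter((sha) => input.isAncestorOf(sha));
  const nonAncestorShas = shas.filter((sha) => !input.isAncestorOf(sha));

  // Last "(#N)" token in the subject, then literal substring checks.
  const tokens = input.subject.match(/\(#\d+\)/g) ?? [];
  const prToken = tokens.length > 0 ? tokens[tokens.length - 1] : undefined;
  const prHit = prToken !== undefined && input.baseChangelog.indexOf(prToken) !== -1;
  const subject = input.subject.trim();
  const subjectHit = subject !== "" && input.baseChangelog.indexOf(subject) !== -1;

  const evidence = { ancestorShas, nonAncestorShas, prHit, subjectHit };
  // Deterministic precedence: provenance, then PR token, then full subject.
  if (ancestorShas.length > 0) return { verdict: "DROP", channel: "cherry-pick-provenance", evidence };
  if (prHit) return { verdict: "DROP", channel: "changelog-grep:pr", evidence };
  if (subjectHit) return { verdict: "DROP", channel: "changelog-grep:subject", evidence };
  // No positive signal: hand off for conflict-content inspection, never default to PICK.
  return { verdict: "REVIEW", channel: "none", evidence, needsConflictContentInspection: true };
}
```

Note that evidence is computed for all channels before any verdict is returned, which is what lets a provenance-driven DROP still carry the CHANGELOG findings.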
…el (#408)

* feat(swim-37): #324 §1 trap-class CHANGELOG-byte-grep discovery channel

Wires swim-37 trap-class §1 (parallel-evolution / cherry-false-negative) discovery channel onto runnable ground. Pure function, byte-pinned to studies/swim-37/traps/parallel-evolution-class.md §2 byte-walk against v2026.4.24 CHANGELOG L164 (instance `7ee46a3ab9 fix: Add runner label to /status (openclaw#70595)`).
API:
- grepChangelog(subject, content): literal grep -F semantics, returns ALL hits (memo flags subject-line collisions as known false-positive mode; channel returns all so caller can decide)
- extractPrNumberToken(subject): pulls trailing `(#N)` PR-token, last match wins (commits sometimes cite earlier PRs in body but subject's trailing PR is the landing PR)
- discoverChangelogHit(subject, content): composite, PR-token first, fall through to full-subject; needle field tells caller which channel found the hit (or which was attempted last on miss)
Wires trap §1 it.todo:
swim-runner.test.ts L81: "CHANGELOG-byte-grep discovery channel emits drop-with-reason span"
…onto runnable ground without depending on captureSwim() shim. The span-emission piece is downstream (callsite needs to import this and call emitContinuationDisabledSpan/equivalent with discovery.channel attribute) but the discovery primitive is now testable in isolation.
Coverage (18 tests):
- subject-path: full subject, collision (multi-hit), empty subject (defense-in-depth: empty must NOT match-everything), whitespace, empty changelog, literal grep -F (no regex interpretation)
- PR-token extract: trailing #N, last-of-multiple, none, bare #N (must miss; only `(#N)` form), empty parens
- composite: load-bearing PR-channel hit on openclaw#70595, fallback to subject for harness-divergence instances, NO-hit for harness-divergence (memo says CHANGELOG silent for these)
- channel attribution: needle=PR-token vs needle=subject distinguishes channels for downstream span attrs
Boundary: pure string-in, structured-out. No git, no fs, no network. Caller drives `git show <base>:CHANGELOG.md` and `git log -1 --format='%s' <commit>`. Discipline matches the helper-tier contract pattern from PR #406 (94fc8d1).
Result: 18/18 passing. swim-37 vitest project: 52 passed | 18 todo (4). pnpm lint:core: 0/0.
Closes part of #324; wires §1 it.todo placeholder.

* feat(swim-37): #324 §1 cross-cut cherry-pick provenance parser

Adds the SECOND §1 discovery channel from the trap-class memo cross-cut (L168 of studies/swim-37/traps/parallel-evolution-class.md). Where CHANGELOG-byte-grep catches the rebase-source's WORK already in base, cherry-pick-provenance catches the rebase-source's COMMIT already in base, by parsing the `(cherry picked from commit <sha>)` footer that git appends. The trap-class memo notes Instance 3 (`aa1908bf38`) carries this footer pointing to `9dd097a7a5...`, which would have been caught deterministically before conflict triage if the parser existed.
API:
- parseCherryPickProvenance(commitBody): returns ALL footers in body order; multi-pick provenance accumulates when commits cross multiple branches; caller decides which is load-bearing
- lastCherryPickSourceSha(commitBody): convenience for the common case (most recent pick in chain)
Boundary: pure parser. Caller drives `git show -s --format=%B` to get body and `git merge-base --is-ancestor <src> <base>` to make the ancestor decision.
Coverage (17 tests):
- byte-pinned aa1908b footer (40-hex)
- multi-footer in body order (cross-branch picks)
- trailing-period acceptance
- short-SHA acceptance (7+ hex, git's unique-floor)
- non-hex rejection, too-short rejection
- anchored-line discipline (sentence with phrase ≠ footer)
- leading-whitespace tolerance (indented footer blocks)
- empty body, no-footer body
- revert-footer NOT confused with pick-footer
- evidence preservation (line field for downstream span attrs)
Companion to chunk-1 of this PR (changelog-grep). Together they cover the two positive-signal channels from the §1 memo. The conflict-content classification rubric (third channel) is harder to cover as a pure function because it needs file-path + diff inspection, so it's left for a later slice with the right shape.
Result: 17/17 passing this file. swim-37 vitest project: 69 passed | 18 todo (5 files). pnpm lint:core: 0/0.
Closes more of #324; second of two §1 discovery primitives.

* feat(swim-37): #324 §1 entry-point classifier (composes both discovery primitives)

Wires the §1 entry-point it.todo placeholder
swim-runner.test.ts L80: "rebase bot classifies synthetic squash-rebased commit as DROP (not PICK)"
onto runnable ground by composing the two §1 discovery primitives (changelog-grep + cherry-pick-provenance) into a single DROP/PICK/REVIEW verdict with channel attribution and structured evidence for downstream span emission.
API:
- classifyRebasePick({subject, commitBody, baseChangelog, isAncestorOf}) returns {verdict: DROP|PICK|REVIEW, channel, evidence, needsConflictContentInspection?}
Channel ordering (memo-aligned, deterministic):
1. cherry-pick-provenance (highest precision: exact source SHA, ancestor-of-base check unambiguous)
2. changelog-grep:pr (high precision: PR tokens rarely collide)
3. changelog-grep:subject (medium precision: substring, can collide)
Load-bearing invariant: when both positive-signal channels miss, the verdict is REVIEW with needsConflictContentInspection=true, NOT a silent default to PICK. The memo's third channel (conflict-content classification rubric) needs file-path + diff inspection, not pure string→struct, so it's NOT covered here; classifier surfaces the gap explicitly so the caller can hand off rather than misclassify the test-harness divergence cases.
Evidence completeness: even when one channel is load-bearing for the verdict, ALL channels' evidence is recorded in the result so the downstream span emission can carry full audit data. Non-ancestor cherry-pick footers go to a separate slot for cross-branch provenance audit.
Boundary: pure function. Caller drives:
- subject from `git log -1 --format='%s' <pick>`
- body from `git show -s --format=%B <pick>`
- base CHANGELOG from `git show <base>:CHANGELOG.md`
- isAncestorOf wrapping `git merge-base --is-ancestor`
Coverage (10 tests):
- Instance 1 (openclaw#70595): DROP via changelog-grep:pr; no needsConflictContentInspection on DROP
- Instance 3 (cherry-pick): DROP via cherry-pick-provenance; precedence over CHANGELOG when both fire; non-ancestor footer doesn't trigger DROP via that channel
- Instance 2 (test-harness): REVIEW with needsConflictContentInspection=true; classifier never defaults to PICK on absence-of-signal
- subject-channel: DROP via changelog-grep:subject
- evidence completeness: both channels' evidence recorded when both fire; non-ancestor footers separated from load-bearing slot
Result: 10/10 passing this file. swim-37 vitest project: 79 passed | 18 todo (6 files). pnpm lint:core: 0/0.
Closes part of #324; third of three §1 todos wired to runnable ground.

---------

Co-authored-by: Ronan 🌊 <ronan@solidor.io>
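The three changelog-grep primitives named in the first commit of this squash can be sketched as follows. The function names carry a `Sketch` suffix because the real signatures and return shapes are not shown here; the `ChangelogHit` and `Discovery` types are assumptions, and the behaviour follows only what the commit message states (literal matching, all hits returned, empty needle never matches, last `(#N)` wins, PR token tried before full subject).

```typescript
interface ChangelogHit {
  lineNumber: number; // 1-based line in the changelog
  line: string;
}

// Literal substring search (grep -F style): no regex interpretation, all hits returned.
function grepChangelogSketch(needle: string, content: string): ChangelogHit[] {
  const trimmed = needle.trim();
  if (trimmed === "") return []; // an empty needle must not match everything
  const hits: ChangelogHit[] = [];
  content.split("\n").forEach((line, index) => {
    if (line.indexOf(trimmed) !== -1) hits.push({ lineNumber: index + 1, line });
  });
  return hits;
}

// Returns the last "(#N)" token in the subject; bare "#N" and "()" do not count.
function extractPrNumberTokenSketch(subject: string): string | undefined {
  const tokens = subject.match(/\(#\d+\)/g);
  return tokens && tokens.length > 0 ? tokens[tokens.length - 1] : undefined;
}

interface Discovery {
  needle: string; // which needle found the hit, or was attempted last on a miss
  hits: ChangelogHit[];
}

// Composite: try the PR token first, then fall through to the full subject.
function discoverChangelogHitSketch(subject: string, content: string): Discovery {
  const token = extractPrNumberTokenSketch(subject);
  if (token !== undefined) {
    const hits = grepChangelogSketch(token, content);
    if (hits.length > 0) return { needle: token, hits };
  }
  return { needle: subject, hits: grepChangelogSketch(subject, content) };
}
```

Returning the needle alongside the hits is what lets a caller attribute the result to the PR-token channel or the subject channel without re-running the search.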
Now that #408 (PR #408 merged at cb73bc8) landed `rebase-classifier.ts` with the pure `classifyRebasePick` composer, the integration-tier §1 entry-point it.todo on swim-runner.test.ts L80 is wireable without any new shim.
Synthetic commit pinned: subject mentions PR openclaw#70595, base CHANGELOG contains the same PR token (squash-rebase shape from the memo's 7ee46a3 anchor instance). Discovery channel: changelog-grep:pr.
L81 (CHANGELOG-byte-grep emits drop-with-reason span) stays todo; it needs the production callsite that pairs the channel with `emitContinuationDisabledSpan`, per 🌊's #408 note.
Test delta:
Before: 20 passed | 16 todo (36)
After: 21 passed | 15 todo (36)
+1 live | -1 todo
…405)

* test(swim-37): wire captureSwim() for continue_work primitive

Replaces the placeholder `declare function captureSwim` in `studies/swim-37/harness/swim-runner.test.ts` with a real implementation in `studies/swim-37/harness/swim-runner.ts`. The wired path drives `emitContinuationWorkSpan` against `createInMemorySpanRecorder()` and returns the captured spans + the synthesized `chainId` (uuid v7 via `generateChainId`).
Flips two prior `it.todo` markers to live tests:
- emits continuation.work span with chain.id stamped (#366)
- span carries chain.step.remaining attribute
Plus three new live tests:
- sentinel: captureSwim() is wired for continue_work
- refuses continue_delegate / heartbeat / lich primitives with a clear error so the spec's remaining `it.todo` markers stay honest about what is implemented
- repeated calls don't leak capture state between invocations
STDOUT-only discipline preserved: no BasicTracerProvider, no OTLP exporter, no @opentelemetry/sdk-trace-base machinery; capture flows through `setContinuationTracer(recorder.tracer)` as documented in README.md.
Other primitives (`continue_delegate`, `heartbeat`, lich-shape) remain `it.todo` until the corresponding dispatch / heartbeat / compaction-release seams have a comparable single-helper entry point we can drive synthetically.
Test results:
- swim-37 vitest project: 20 passed, 16 todo, 0 failed
- in-memory-span-recorder.test.ts unchanged (12/12 still pass)
- tsgo errors visible elsewhere in tree are pre-existing (src/plugin-sdk/provider-tools.ts), not introduced by this change
Refs #324 (swim-37 harness): `it.todo` count drops from 17 → 16, first concrete primitive driven through the live tracer registry.

* docs(swim-37): continue_delegate wiring memo (memo-before-wire)

Pre-PR design memo for the next swim-37 primitive after #405 (continue_work). Decides shape before the wire so the PR lands clean.
Resolved by reading production code:
Q1: real setTimeout vs fake timers. N/A for dispatch-accept (synchronous, pre-timer); banked for a separate continue_delegate_fire swim.
Cohort sign-off requested on:
Q2: recipient fan-out shape. N spans sharing chain.id (matches chunk-3 cohort design pin + #355 Stage-2 budget semantics).
Q3: delegate.mode x delegate.delivery. 8-cell it.each matrix + omission-contract row.
Banked for follow-up:
Q4: continue_delegate_fire as separate primitive, not sub-case; informs SwimPrimitive type-union shape.
Standard applied (per Cael 2026-04-27 #1498505918, figs #1498505870): would skipping this memo cost a chunk's worth of rework? Yes, three open Qs would have changed file shape.
Refs:
- #405 (continue_work wired)
- #324 (swim-37 harness)
- docs/design/334-slice2-chunk5b-delegate-fire-memo.md (pattern)

* test(swim-37): wire captureSwim() for continue_delegate primitive

Implements the design from `docs/design/swim-37-continue-delegate-wiring-memo.md` (commit 3bb086c762, same branch).
Wiring (swim-runner.ts):
- New continue_delegate case in the switch.
- Adds CaptureSwimOptions axes: recipients, delivery, delegateMode.
- Drives emitContinuationDelegateSpan once per recipient with shared chain.id (per chunk-3 cohort design pin: N recipients = N spans sharing chain.id, NOT one span with a recipients list).
- Validates recipients is a positive integer.
Tests (swim-runner.test.ts):
- 8-cell it.each matrix: delegate.mode (normal | silent | silent-wake | post-compaction) x delegate.delivery (immediate | timer).
- Omission contract: delegate.mode attribute absent when caller passes undefined.
- Fan-out: 3 recipients emit 3 spans with shared chain.id.
- Validation: recipients=0 and recipients=1.5 reject.
- Removes the prior 'refuses continue_delegate' test row (now wired).
Two it.todo markers preserved for downstream concerns:
- chain-budget decrement / ChainBudget.declineToCarry observable (needs ChainBudget integration, not in scope here).
- fan-out budget arithmetic per #355 Stage-2 (1 step not N), separate from span cardinality.
Test results delta:
Before: 8 passed | 16 todo (24 tests)
After: 19 passed | 15 todo (34 tests)
+11 live | -1 todo
Refs:
- #324 swim-37 harness
- #405 captureSwim continue_work (parent of this branch)
- docs/design/swim-37-continue-delegate-wiring-memo.md

* test(swim-37): pin recipient.index gap on continue_delegate fan-out

Per Cael's caution on the wiring memo (Discord msg 1498507232185286849 sign-off): N spans sharing chain.id risks collapsing into analytic mush at scale unless per-recipient distinction is visible in attrs. Currently emitContinuationDelegateSpan exposes no recipient.index axis; all N spans in a fan-out are byte-identical except spanId.
Adds:
- One live GAP-PIN test asserting the current (deficient) reality: fan-out spans toEqual on attributes. This will fail loudly when the production helper grows the axis, prompting a flip to not.toEqual.
- One it.todo marker for the upcoming production-helper change.
Production-helper change is out of scope for this PR; will file follow-up issue. The gap-pin lives in the integration tier so it shows up in any code-review touching delegate-dispatch span shape.
Refs:
- #324 swim-37 harness
- #405 captureSwim continue_work + continue_delegate (this PR)
- docs/design/swim-37-continue-delegate-wiring-memo.md (Q2)
- Cael Discord sign-off msg 1498507232185286849

* test(swim-37): wire §1 entry-point todo onto classifyRebasePick

Now that #408 (PR #408 merged at cb73bc8) landed `rebase-classifier.ts` with the pure `classifyRebasePick` composer, the integration-tier §1 entry-point it.todo on swim-runner.test.ts L80 is wireable without any new shim.
Synthetic commit pinned: subject mentions PR openclaw#70595, base CHANGELOG contains the same PR token (squash-rebase shape from the memo's 7ee46a3 anchor instance). Discovery channel: changelog-grep:pr.
L81 (CHANGELOG-byte-grep emits drop-with-reason span) stays todo; it needs the production callsite that pairs the channel with `emitContinuationDisabledSpan`, per 🌊's #408 note.
Test delta:
Before: 20 passed | 16 todo (36)
After: 21 passed | 15 todo (36)
+1 live | -1 todo
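The continue_delegate fan-out contract described in this squash (N recipients produce N spans sharing one chain.id, `delegate.mode` omitted when undefined, `recipients` validated as a positive integer) can be sketched without any tracing machinery. Everything below is hypothetical: the span name, the counter-based chain id (the real harness is said to use a uuid v7 from `generateChainId`), and the `captureDelegateFanOutSketch` name stand in for `captureSwim()` and `emitContinuationDelegateSpan`, whose signatures are not shown here.

```typescript
interface RecordedSpan {
  name: string;
  attributes: Record<string, string | number>;
}

interface FanOutOptions {
  recipients: number;
  delivery: "immediate" | "timer";
  delegateMode?: "normal" | "silent" | "silent-wake" | "post-compaction";
}

let chainCounter = 0; // placeholder id source, not a uuid v7

function captureDelegateFanOutSketch(opts: FanOutOptions): { chainId: string; spans: RecordedSpan[] } {
  const n = opts.recipients;
  // Reject 0, negatives, NaN, and non-integers such as 1.5.
  if (!(n > 0) || Math.floor(n) !== n) {
    throw new Error("recipients must be a positive integer");
  }
  chainCounter += 1;
  const chainId = "chain-" + String(chainCounter);
  const spans: RecordedSpan[] = [];
  for (let i = 0; i < n; i++) {
    // One span per recipient, all stamped with the same chain.id.
    const attributes: Record<string, string | number> = {
      "chain.id": chainId,
      "delegate.delivery": opts.delivery,
    };
    // Omission contract: the attribute is absent, not set to a placeholder.
    if (opts.delegateMode !== undefined) attributes["delegate.mode"] = opts.delegateMode;
    spans.push({ name: "continuation.delegate", attributes }); // span name is an assumption
  }
  return { chainId, spans };
}
```

In this sketch the spans of one fan-out carry identical attributes, which mirrors the recipient.index gap that the gap-pin test records.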
* Add runner label to status output
* Add changelog entry for status runner label
* Fix status runner detection and sanitization
Summary
This change adds an explicit `Runner:` field to `/status` so the status output tells you which execution path currently owns the session, not just which model is selected.
Examples of what `/status` can now show:
How it works
The status formatter now resolves the runner label from the session in this order:
1. If the session is ACP-backed (`sessionEntry.acp`), `/status` shows the ACP harness agent and backend as `<agent> (acp/<backend>)`. This is the exact case that distinguishes `codex` from `gemini`, `claude`, or any other ACP agent.
2. If the provider is CLI-backed, `/status` shows `<provider> (cli)`.
3. Otherwise, `/status` falls back to `pi (embedded)`.
That keeps the runner identity separate from the selected model line. The model still tells you what model/provider is being used, while `Runner:` tells you which runtime family is actually executing the session.
Verification
Focused verification passed with:
Note: the repo's default staged `vitest` hook currently routes this changed test file into a config that excludes `src/auto-reply/**`, which causes an unrelated `No test files found` failure. The targeted `/status` suites above are green.
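The resolution order under "How it works" can be sketched as a single function. This is an illustration, not the PR's `resolveRunnerLabel`: the `SessionEntrySketch` type, the injected `isCliProvider` predicate, the `(acp)` label used when no backend is stored, and the `example-backend` value are assumptions, and terminal sanitization is left out.

```typescript
interface SessionEntrySketch {
  acp?: { agent?: string; backend?: string };
}

function resolveRunnerLabelSketch(
  entry: SessionEntrySketch | undefined,
  provider: string,
  isCliProvider: (provider: string) => boolean, // injected; real detection is not shown here
): string {
  // 1. ACP-backed session: "<agent> (acp/<backend>)".
  const agent = entry?.acp?.agent?.trim();
  if (agent) {
    const backend = entry?.acp?.backend?.trim();
    // Assumption: drop the backend suffix when none is stored.
    return backend ? `${agent} (acp/${backend})` : `${agent} (acp)`;
  }
  // 2. CLI-backed provider: "<provider> (cli)".
  if (isCliProvider(provider)) return `${provider} (cli)`;
  // 3. Everything else runs on the embedded runner.
  return "pi (embedded)";
}

// Formats the status line from the resolved label.
function formatRunnerLine(label: string): string {
  return `Runner: ${label}`;
}

const acpLine = formatRunnerLine(
  resolveRunnerLabelSketch({ acp: { agent: "codex", backend: "example-backend" } }, "openai", () => false),
);
const cliLine = formatRunnerLine(
  resolveRunnerLabelSketch({}, "claude-cli", (p) => p === "claude-cli"),
);
const embeddedLine = formatRunnerLine(
  resolveRunnerLabelSketch(undefined, "anthropic", () => false),
);
```

Because the ACP check runs first, an ACP session keeps its agent label even when its provider would also qualify as CLI-backed.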