test(otel): pin delegate.continuation span uniformity across silent / silent-wake / post-compaction modes by cael-dandelion-cult · Pull Request #511 · karmaterminal/openclaw

cael-dandelion-cult · 2026-05-02T00:06:14Z

Summary

Adds a runner integration trap for delegate-dispatch span uniformity across continuation delegate modes. The current tracer contract emits `continuation.delegate.dispatch`; the test covers the shipped `silent` and `silent-wake` tool-delegate paths and pins `post-compaction` as the remaining production gap with `it.todo`.

Gap-list reference: 🌊 Discord message 1499912840453296188 from the workorder (channel URL was not available in this worktree).

Per-mode result

Mode	Result	Evidence
`silent`	PASS	Exactly one `continuation.delegate.dispatch` span; `delegate.mode=silent`; `chain.id` preserves the existing parent chain id; uniform key set.
`silent-wake`	PASS	Exactly one `continuation.delegate.dispatch` span; `delegate.mode=silent-wake`; `chain.id` preserves the existing parent chain id; uniform key set.
`post-compaction`	TODO	Production gap: post-compaction delegate delivery persists continuation chain state but does not emit `continuation.delegate.dispatch` with `delegate.mode=post-compaction` yet.
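
The per-mode checks above can be sketched as a small standalone helper. This is a minimal illustration, not the test shipped in this PR: the `RecordedSpan` shape, the `checkUniformity` helper, and all attribute values (including `delegate.delivery: "tool"`) are hypothetical. Only the span name and the attribute keys come from the PR description.

```typescript
// Hypothetical span record; the real tracer's span type is not shown in this PR.
interface RecordedSpan {
  name: string;
  attributes: Record<string, string | number>;
}

const DISPATCH_SPAN = "continuation.delegate.dispatch";

// Sorted attribute keys joined into one string, so key sets compare by value.
function keySet(span: RecordedSpan): string {
  return Object.keys(span.attributes).sort().join(",");
}

// Returns a list of problems: a mode without exactly one dispatch span,
// or dispatch spans whose attribute key sets differ from each other.
function checkUniformity(spans: RecordedSpan[], modes: string[]): string[] {
  const problems: string[] = [];
  const dispatch = spans.filter((s) => s.name === DISPATCH_SPAN);
  for (const mode of modes) {
    const count = dispatch.filter((s) => s.attributes["delegate.mode"] === mode).length;
    if (count !== 1) {
      problems.push(`${mode}: expected 1 dispatch span, got ${count}`);
    }
  }
  const distinctKeySets = new Set(dispatch.map(keySet));
  if (distinctKeySets.size > 1) {
    problems.push(`non-uniform key sets: ${Array.from(distinctKeySets).join(" | ")}`);
  }
  return problems;
}

// Illustrative values only.
const recorded: RecordedSpan[] = [
  {
    name: DISPATCH_SPAN,
    attributes: { "chain.id": "chain-a", "delegate.mode": "silent", "delegate.delivery": "tool", "chain.step.remaining": 4 },
  },
  {
    name: DISPATCH_SPAN,
    attributes: { "chain.id": "chain-b", "delegate.mode": "silent-wake", "delegate.delivery": "tool", "chain.step.remaining": 3 },
  },
];

const shippedProblems = checkUniformity(recorded, ["silent", "silent-wake"]);
const withGapProblems = checkUniformity(recorded, ["silent", "silent-wake", "post-compaction"]);
```

With only the two shipped modes the problem list is empty. Adding `post-compaction` to the expected modes reports the missing span, which is the gap the table records as TODO.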

Sabotage walk

To fail this trap, remove or rename the runner-side `emitContinuationDelegateSpan` call for accepted tool delegates, drop `chain.id`, `delegate.mode`, `delegate.delivery`, or `chain.step.remaining` from the emitted attributes, mint a fresh chain id instead of preserving an existing parent chain id, or add the post-compaction production emitter without replacing the TODO with a passing assertion.
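
As a rough illustration of how the first three sabotage routes would be detected, the sketch below reports a violation for a missing span, a dropped attribute, or a freshly minted chain id. The `DispatchSpan` type and `findViolations` helper are hypothetical stand-ins, not code from this repository. Only the four attribute keys are taken from the description above.

```typescript
// Hypothetical minimal span shape for this sketch.
interface DispatchSpan {
  attributes: Record<string, string | number | undefined>;
}

// Attribute keys named in the sabotage walk.
const REQUIRED_KEYS = ["chain.id", "delegate.mode", "delegate.delivery", "chain.step.remaining"];

// Returns one message per violated expectation; an empty list means the span passes.
function findViolations(span: DispatchSpan | undefined, parentChainId: string): string[] {
  if (span === undefined) {
    return ["no continuation.delegate.dispatch span emitted"];
  }
  const violations: string[] = [];
  for (const key of REQUIRED_KEYS) {
    if (span.attributes[key] === undefined) {
      violations.push(`missing attribute ${key}`);
    }
  }
  const chainId = span.attributes["chain.id"];
  if (chainId !== undefined && chainId !== parentChainId) {
    violations.push(`chain.id ${chainId} does not preserve parent ${parentChainId}`);
  }
  return violations;
}

// Illustrative values only.
const healthy: DispatchSpan = {
  attributes: { "chain.id": "parent-1", "delegate.mode": "silent", "delegate.delivery": "tool", "chain.step.remaining": 4 },
};
const freshChain: DispatchSpan = { attributes: { ...healthy.attributes, "chain.id": "minted-9" } };
const droppedKey: DispatchSpan = { attributes: { ...healthy.attributes, "delegate.delivery": undefined } };

const healthyViolations = findViolations(healthy, "parent-1");
const freshChainViolations = findViolations(freshChain, "parent-1");
const droppedKeyViolations = findViolations(droppedKey, "parent-1");
const missingSpanViolations = findViolations(undefined, "parent-1");
```

The fourth route (adding the production emitter while leaving the TODO in place) is a property of the test file rather than of a span, so it is not modeled here.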

Gates

Command	Exit	Notes
`pnpm tsgo`	0	Passed.
`pnpm check`	0	Passed after test-only lint cleanup in existing continuation-mode tests.
`pnpm test src/auto-reply/reply/agent-runner.continuation-span-uniformity.test.ts`	0	1 passed, 1 TODO.
`pnpm test`	0	Final full-suite rerun passed: 412 files, 4551 tests, 4 skipped. Earlier full reruns exposed unrelated flaky shards that both passed in isolation.
`pnpm build`	0	Passed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Three test files from merged PRs (#462, #468, #511) were absent because this branch forked from canonical2 before those PRs landed. The post-revert allow-list audit (§3.4) flagged them as deletions from landed PRs. Restored from canonical2 HEAD (74940e5).

- types.mode-shape.test.ts (#462)
- agent-runner.continuation-span-uniformity.test.ts (#511)
- store.continuation-merge.test.ts (#468)

tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…6.4.24) (#515)

* wo(canonical2-rebase-pathB): rebase Path-B's 5 cleanup commits onto canonical2 (figs directive 22:55Z)

* chore(v3-cleanup): wave A cohort-identity scrub
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(v3-cleanup): drop rejected rebase artifacts
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: scrub workspace template wording
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(v3-cleanup): wave B structural dedup of continuation runtime
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave B
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave C import discipline and build warnings
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave C
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave D surface continuation failures
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: surface compaction count reconcile failures
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(v3-cleanup): wave E continuation coverage
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave E
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: align bundled plugin dependency types
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: isolate bedrock app profile runtime deps
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub fork process labels from source comments
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: close continuation type design blockers
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub continuation prompt process link
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 final checkpoint
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "chore(v3-cleanup): drop rejected rebase artifacts"

  This reverts commit 3396b88. The original commit mass-deleted 30 files (6745 deletions) under the label "rejected rebase artifacts." ~5141 of those deletions are landed swim-37 durability harness substrate from merged PRs #412/#413/#414/#416/#417/#418/#419 plus collateral docs/scripts. These are not rejected artifacts; they are committed, merged test infrastructure that proves continuation durability across compaction.

  Cohort review (🩸 + 🌊 + 🌻 + 🌫) confirmed the block finding at PR #515 issuecomment-4362337067.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: release-note for context-pressure band-derivation behavior change

  Wave B (cefa09d) changed context-pressure bands from fixed [25, 80, 90, 95] to threshold-derived [thresholdPct, 90, 95]. At default 0.8 the implicit 25% early-warning band is removed. Ship-acceptable per cohort review; release-note documents the change and points to #516 for the earlyWarningBand config opt follow-up.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: restore landed-PR tests missing from rebase fork-point

  Three test files from merged PRs (#462, #468, #511) were absent because this branch forked from canonical2 before those PRs landed. The post-revert allow-list audit (§3.4) flagged them as deletions from landed PRs. Restored from canonical2 HEAD (74940e5).

  - types.mode-shape.test.ts (#462)
  - agent-runner.continuation-span-uniformity.test.ts (#511)
  - store.continuation-merge.test.ts (#468)

  tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add rebase.classify to ContinuationSpanName for restored tracer

  The revert of 3396b88 restored src/rebase/tracer.ts which emits "rebase.classify" spans. Commit 4871c81 (fix: close continuation type design blockers) narrowed startSpan from string to ContinuationSpanName after tracer.ts was deleted; this is an additive fix to include the span name in the union.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(continuation): add earlyWarningBand config opt for post-compaction cycle primer

* test(continuation): pin earlyWarningBand default-preservation + opt-out branches

* fix(continuation): add curly braces to satisfy linter

* fix(continuation): unblock early-warning band fire path + make field optional

  Three bugs caught in cohort review of v5 (3e88ce5):

  1. Suppression guard bug (Silas): non-postCompaction call sites bailed with 'ratio < threshold' BEFORE the resolved early-warn band could fire. Even with earlyWarningBand explicitly set, ratio=0.25 + threshold=0.8 resolved band=25 then was discarded. Guard now suppresses only when 'band === 0 && ratio < threshold', which preserves the round-to-band-0 dedup edge case while letting early-warn fire.
  2. Type-required regression (Elliott): ContinuationRuntimeConfig had 'earlyWarningBand: number' (required), breaking 3 test fixtures (config.test, scheduler.test, post-compaction-delegate-dispatch.test) with TS2741. Field already optional at zod + resolver-default site; making the type optional matches.
  3. Schema baseline regen (Elliott): src/config/schema.base.generated.ts needed regen to absorb the new earlyWarningBand field; preexisting models.providers.*.request.tls.insecureSkipVerify drift also absorbed in the same regen.

  Tests added:

  - checkContextPressure 'fires early-warning band below threshold when earlyWarningBand is set' (default-preservation path)
  - checkContextPressure 'does NOT fire below threshold when earlyWarningBand is 0' (opt-out path)

  All 107 affected tests pass: context-pressure (19), config (9), scheduler (12), schema.base.generated (10), post-compaction-delegate-dispatch (23), reply/context-pressure (34).

  Cohort cosign chain: 🩸 (root catch v5), 🌊 (default=0 catch), 🌫 (suppression-guard catch), 🌻 (type-required + baseline catch).

  Refs #515

---------

Co-authored-by: frond-scribe <frond-scribe@karmaterminal>
Co-authored-by: Test User <test@example.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: dandelion cult - cael 🩸 <cael@dandelion.cult>
Co-authored-by: dandelion cult - silas 🌫 <silas.dandelion.cult@hotmail.com>
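
The suppression-guard fix described in that commit can be illustrated with a small model. This is a sketch under assumptions, not the repository's `checkContextPressure`: the helper names `deriveBands`, `resolveBand`, `suppressedBefore`, and `suppressedAfter` are hypothetical, and the band-resolution rule (highest band at or below the current percentage, else 0) is an assumed reading of the commit message. The numbers (ratio 0.25, threshold 0.8, band 25) and the two guard expressions are taken from the message.

```typescript
// Band list: the threshold-derived band, then 90 and 95, plus an optional
// early-warning band when earlyWarningBand is greater than 0.
function deriveBands(threshold: number, earlyWarningBand: number): number[] {
  const thresholdPct = Math.round(threshold * 100);
  const bands = [thresholdPct, 90, 95];
  if (earlyWarningBand > 0) {
    bands.push(earlyWarningBand);
  }
  // De-duplicate and sort ascending.
  return Array.from(new Set(bands)).sort((a, b) => a - b);
}

// Assumed rule: the highest band at or below the current percentage, else 0.
function resolveBand(ratio: number, bands: number[]): number {
  const pct = ratio * 100;
  let resolved = 0;
  for (const band of bands) {
    if (pct >= band) {
      resolved = band;
    }
  }
  return resolved;
}

// Guard before the fix: bail whenever the ratio is below the threshold.
function suppressedBefore(ratio: number, threshold: number): boolean {
  return ratio < threshold;
}

// Guard after the fix: bail only when no band resolved and the ratio is below the threshold.
function suppressedAfter(ratio: number, threshold: number, band: number): boolean {
  return band === 0 && ratio < threshold;
}

const threshold = 0.8;
const ratio = 0.25;

const bandsWithEarlyWarning = deriveBands(threshold, 25); // [25, 80, 90, 95]
const bandWithEarlyWarning = resolveBand(ratio, bandsWithEarlyWarning); // 25

const bandsOptedOut = deriveBands(threshold, 0); // [80, 90, 95]
const bandOptedOut = resolveBand(ratio, bandsOptedOut); // 0
```

In this model the old guard suppresses the ratio=0.25 case even though band 25 resolved, while the new guard lets it through and still suppresses the opt-out case where the band resolves to 0.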

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… test

The lane's switch from resolveContinuationRuntimeConfig to resolveLiveContinuationRuntimeConfig in agent-runner.ts makes the runner read continuation knobs from getRuntimeConfigSnapshot() with fallback to the captured cfg. In production the snapshot is always set to the full live config, so live-read returns the right values. In tests, getRuntimeConfig() side-effects setRuntimeConfigSnapshot({}) via loadPinnedRuntimeConfig when no real config file exists, so the snapshot becomes an empty object before the runner's enforcement points run -- the (snap ?? fallback) check sees a non-null {} and returns continuation defaults instead of the test's captured cfg.

agent-runner.continuation-span-uniformity.test.ts was added by #511 AFTER #536's branch was based, so it lacked the setRuntimeConfigSnapshot(run.followupRun.run.config) / clearRuntimeConfigSnapshot() lifecycle hooks the other span tests already carry. Adding them surfaces the test's continuation cfg (maxChainLength=6) to the live-read path, restoring the chain.step.remaining math the test expects (6-(currentChainCount+1) = 4 for silent / 3 for silent-wake).

CI on PR #576 caught this. Same one-test pattern as agent-runner.continuation-delegate-fire-span.test.ts and agent-runner.continuation-work-span.test.ts.
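
The failure mode in that message comes down to `??` treating an empty object as a present value. The sketch below reproduces it with local stand-ins that borrow the function names from the commit message; their bodies, the `RuntimeConfig` shape, and the default of 10 are assumptions made for illustration. The chain counts 1 and 2 are inferred from the stated arithmetic (6-(1+1) = 4, 6-(2+1) = 3) and are not quoted from the test.

```typescript
// Hypothetical config shape for this sketch.
interface RuntimeConfig {
  continuation?: { maxChainLength?: number };
}

// Assumed default, not taken from the repository.
const DEFAULT_MAX_CHAIN_LENGTH = 10;

// Local stand-ins for the snapshot store named in the commit message.
let snapshot: RuntimeConfig | null = null;
function setRuntimeConfigSnapshot(cfg: RuntimeConfig): void {
  snapshot = cfg;
}
function clearRuntimeConfigSnapshot(): void {
  snapshot = null;
}
function getRuntimeConfigSnapshot(): RuntimeConfig | null {
  return snapshot;
}

// Live read: prefer the snapshot, fall back to the captured cfg only when
// the snapshot is null or undefined. An empty object counts as present.
function resolveLiveMaxChainLength(fallbackCfg: RuntimeConfig): number {
  const cfg = getRuntimeConfigSnapshot() ?? fallbackCfg;
  return cfg.continuation?.maxChainLength ?? DEFAULT_MAX_CHAIN_LENGTH;
}

// Arithmetic quoted in the commit message: max - (currentChainCount + 1).
function stepsRemaining(maxChainLength: number, currentChainCount: number): number {
  return maxChainLength - (currentChainCount + 1);
}

const testCfg: RuntimeConfig = { continuation: { maxChainLength: 6 } };

// Unseeded test: the snapshot is {} so the captured cfg is never consulted.
setRuntimeConfigSnapshot({});
const unseededMax = resolveLiveMaxChainLength(testCfg); // falls through to the default

// Seeded test: the lifecycle hook surfaces the test's cfg to the live read.
setRuntimeConfigSnapshot(testCfg);
const seededMax = resolveLiveMaxChainLength(testCfg);
const silentRemaining = stepsRemaining(seededMax, 1);
const silentWakeRemaining = stepsRemaining(seededMax, 2);

// Cleanup hook: with no snapshot, the fallback cfg is used again.
clearRuntimeConfigSnapshot();
const clearedMax = resolveLiveMaxChainLength(testCfg);
```

Seeding the snapshot in a setup hook and clearing it in teardown keeps the test's `maxChainLength=6` visible to the live-read path without changing the production fallback logic.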

… knobs, plan-aware CLI hint (supersedes #536) (#576)

* fix(config): hot-read tools.sessions.visibility for production session tools

  Adds optional getConfig accessor to session-tool factory option types (sessions-list, sessions-history, sessions-send, session-status, sessions-helpers::resolveSessionToolContext) so each execution can read the active runtime config rather than the construction-time snapshot.

  createOpenClawTools gains a liveSessionToolConfig opt-in that routes session-tool factories through getRuntimeConfig. The production wiring paths (pi-tools, gateway tool-resolution, inline-action skill dispatch) opt in. Other call sites keep the construction-time config to avoid churning shared snapshot assumptions.

  This makes `openclaw config set tools.sessions.visibility=all` take effect without rebuilding tool instances. Sessions-visibility re-read is a stateless decision at execution time, so no in-flight state is invalidated -- preserving the RFC §6.5 hot-reload integrity invariant.

  Closes #533. Refs RFC docs/design/continue-work-signal-v2.md §6.5.

* fix(continuation): resolve runtime knobs from active runtime snapshot

  Adds resolveLiveContinuationRuntimeConfig(fallbackCfg) that prefers getRuntimeConfigSnapshot() over the captured cfg snapshot. Switches the six per-turn enforcement points in agent-runner.ts and followup-runner.ts from resolveContinuationRuntimeConfig to the live variant:

  - pre-run context-pressure check (agent-runner.ts:1436): reads contextPressureThreshold and earlyWarningBand at next pressure check
  - bracket continuation scheduler (agent-runner.ts:2115): reads chain cap, cost cap, and delay window at schedule time
  - tool-delegate dispatch (agent-runner.ts:2526): reads chain cap, cost cap, delay window, and per-turn cap at dispatch time
  - hedge timer arm (agent-runner.ts:2889): reads chain cap at arm time
  - followup-path tool-delegate dispatch (followup-runner.ts:477): reads chain cap at dispatch time

  Each call site is at a decision-point or schedule-time read, so already-armed timers, queued retries, and staged post-compaction handoffs keep the values they captured at arm time. New schedules use new values. This preserves the RFC §6.5 in-flight-state integrity invariant while letting gateway/reload config-change events take effect at the next decision.

  Span tests (agent-runner.continuation-*-span.test.ts, agent-runner.misc.runreplyagent.test.ts) carry setRuntimeConfigSnapshot/clearRuntimeConfigSnapshot lifecycle hooks so the new resolution path is exercised under test.

  Closes #19. Refs RFC docs/design/continue-work-signal-v2.md §6.5.

* fix(cli): plan-aware reload hint for openclaw config set/patch/unset

  The `Restart the gateway to apply.` message is now gated through buildGatewayReloadPlan so the user sees the truth: dynamic-read paths print `No gateway restart required.`, hot-reload paths print `Gateway hot reload will apply.`, and the original restart hint only fires when the planner actually requires a restart. Multi-path patches that touch a mix of dynamic and hot paths print the combined hint.

  Adds two regression pins in src/gateway/config-reload.test.ts so the dynamic-read disposition for tools.sessions.visibility and agents.defaults.continuation.maxDelegatesPerTurn does not silently flip back to restart-required.

  Closes #531.

* test(continuation): seed runtime snapshot in delegate span-uniformity test

  The lane's switch from resolveContinuationRuntimeConfig to resolveLiveContinuationRuntimeConfig in agent-runner.ts makes the runner read continuation knobs from getRuntimeConfigSnapshot() with fallback to the captured cfg. In production the snapshot is always set to the full live config, so live-read returns the right values. In tests, getRuntimeConfig() side-effects setRuntimeConfigSnapshot({}) via loadPinnedRuntimeConfig when no real config file exists, so the snapshot becomes an empty object before the runner's enforcement points run -- the (snap ?? fallback) check sees a non-null {} and returns continuation defaults instead of the test's captured cfg.

  agent-runner.continuation-span-uniformity.test.ts was added by #511 AFTER #536's branch was based, so it lacked the setRuntimeConfigSnapshot(run.followupRun.run.config) / clearRuntimeConfigSnapshot() lifecycle hooks the other span tests already carry. Adding them surfaces the test's continuation cfg (maxChainLength=6) to the live-read path, restoring the chain.step.remaining math the test expects (6-(currentChainCount+1) = 4 for silent / 3 for silent-wake).

  CI on PR #576 caught this. Same one-test pattern as agent-runner.continuation-delegate-fire-span.test.ts and agent-runner.continuation-work-span.test.ts.

---------

Co-authored-by: frond-scribe <frond-scribe@karmaterminal>
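
The hot-read change in the first commit of that squash follows a common pattern: an optional accessor that is called on each execution, with the construction-time config as the fallback. The sketch below shows the pattern only. `ToolsConfig`, `SessionToolOptions`, and `createSessionsListTool` are hypothetical stand-ins for the real factory types, and `"before"` is a placeholder visibility value; `getConfig` and `tools.sessions.visibility=all` are the names used in the commit message.

```typescript
// Hypothetical slice of the config relevant to this sketch.
interface ToolsConfig {
  tools?: { sessions?: { visibility?: string } };
}

// Hypothetical factory options: a construction-time snapshot plus an
// optional accessor for the active runtime config.
interface SessionToolOptions {
  config: ToolsConfig;
  getConfig?: () => ToolsConfig;
}

// The tool reads visibility at execution time. When no accessor is given,
// it keeps using the snapshot captured at construction.
function createSessionsListTool(opts: SessionToolOptions): { execute(): string } {
  return {
    execute(): string {
      const cfg = opts.getConfig?.() ?? opts.config;
      return cfg.tools?.sessions?.visibility ?? "unset";
    },
  };
}

// Stand-in for the active runtime config; "before" is a placeholder value.
let liveConfig: ToolsConfig = { tools: { sessions: { visibility: "before" } } };

const snapshotTool = createSessionsListTool({ config: liveConfig });
const liveTool = createSessionsListTool({ config: liveConfig, getConfig: () => liveConfig });

// Simulate a config change after both tools were built.
liveConfig = { tools: { sessions: { visibility: "all" } } };

const snapshotResult = snapshotTool.execute(); // still the construction-time value
const liveResult = liveTool.execute(); // the value set after construction
```

Because the read happens inside `execute`, no tool instance has to be rebuilt for the new value to apply, and call sites that omit the accessor keep their existing snapshot behavior.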

test(otel): pin delegate span uniformity

334a94d

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

silas-dandelion-cult mentioned this pull request May 2, 2026

test(otel): pin delegate.continuation reject-path span shape (status=ERROR + recordException + ended) — follow-up to #511 #512

Open

cael-dandelion-cult merged commit 74940e5 into cael/325-canonical2 May 2, 2026
89 of 94 checks passed

cael-dandelion-cult deleted the cael/otel-span-uniformity-test branch May 2, 2026 00:11

cael-dandelion-cult mentioned this pull request May 2, 2026

Path-B v3 cleanup waves A-E rebased onto canonical2 (cut-path for 2026.4.24) #515

Merged

ronan-dandelion-cult mentioned this pull request May 2, 2026

ronan/otel-delegate-continuation-uniformity — gap #2 of 2026.4.24 cleanup #514

Closed

ronan-dandelion-cult pushed a commit that referenced this pull request May 3, 2026

test(otel): pin delegate span uniformity (#511)

5d786a1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
