test(otel): pin delegate.continuation span uniformity across silent / silent-wake / post-compaction modes #511
Merged
cael-dandelion-cult merged 1 commit into cael/325-canonical2 on May 2, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
74940e5 into cael/325-canonical2
89 of 94 checks passed
cael-dandelion-cult pushed a commit that referenced this pull request on May 2, 2026

Three test files from merged PRs (#462, #468, #511) were absent because this branch forked from canonical2 before those PRs landed. The post-revert allow-list audit (§3.4) flagged them as deletions from landed PRs. Restored from canonical2 HEAD (74940e5).

- types.mode-shape.test.ts (#462)
- agent-runner.continuation-span-uniformity.test.ts (#511)
- store.continuation-merge.test.ts (#468)

tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
cael-dandelion-cult added a commit that referenced this pull request on May 2, 2026

…6.4.24) (#515)

* wo(canonical2-rebase-pathB): rebase Path-B's 5 cleanup commits onto canonical2 (figs directive 22:55Z)

* chore(v3-cleanup): wave A cohort-identity scrub
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(v3-cleanup): drop rejected rebase artifacts
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: scrub workspace template wording
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(v3-cleanup): wave B structural dedup of continuation runtime
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave B
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave C import discipline and build warnings
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave C
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave D surface continuation failures
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: surface compaction count reconcile failures
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(v3-cleanup): wave E continuation coverage
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave E
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: align bundled plugin dependency types
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: isolate bedrock app profile runtime deps
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub fork process labels from source comments
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: close continuation type design blockers
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub continuation prompt process link
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 final checkpoint
  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "chore(v3-cleanup): drop rejected rebase artifacts"

  This reverts commit 3396b88.

  The original commit mass-deleted 30 files (6745 deletions) under the label "rejected rebase artifacts." ~5141 of those deletions are landed swim-37 durability harness substrate from merged PRs #412/#413/#414/#416/#417/#418/#419 plus collateral docs/scripts. These are not rejected artifacts; they are committed, merged test infrastructure that proves continuation durability across compaction.

  Cohort review (🩸 + 🌊 + 🌻 + 🌫) confirmed the block finding at PR #515 issuecomment-4362337067.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: release-note for context-pressure band-derivation behavior change

  Wave B (cefa09d) changed context-pressure bands from fixed [25, 80, 90, 95] to threshold-derived [thresholdPct, 90, 95]. At default 0.8 the implicit 25% early-warning band is removed. Ship-acceptable per cohort review; release-note documents the change and points to #516 for the earlyWarningBand config opt follow-up.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: restore landed-PR tests missing from rebase fork-point

  Three test files from merged PRs (#462, #468, #511) were absent because this branch forked from canonical2 before those PRs landed. The post-revert allow-list audit (§3.4) flagged them as deletions from landed PRs. Restored from canonical2 HEAD (74940e5).

  - types.mode-shape.test.ts (#462)
  - agent-runner.continuation-span-uniformity.test.ts (#511)
  - store.continuation-merge.test.ts (#468)

  tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add rebase.classify to ContinuationSpanName for restored tracer

  The revert of 3396b88 restored src/rebase/tracer.ts which emits "rebase.classify" spans. Commit 4871c81 (fix: close continuation type design blockers) narrowed startSpan from string to ContinuationSpanName after tracer.ts was deleted; additive fix to include the span name in the union.

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(continuation): add earlyWarningBand config opt for post-compaction cycle primer

* test(continuation): pin earlyWarningBand default-preservation + opt-out branches

* fix(continuation): add curly braces to satisfy linter

* fix(continuation): unblock early-warning band fire path + make field optional

  Three bugs caught in cohort review of v5 (3e88ce5):

  1. Suppression guard bug (Silas): non-postCompaction call sites bailed with 'ratio < threshold' BEFORE the resolved early-warn band could fire. Even with earlyWarningBand explicitly set, ratio=0.25 + threshold=0.8 resolved band=25 then was discarded. Guard now suppresses only when 'band === 0 && ratio < threshold', which preserves the round-to-band-0 dedup edge case while letting early-warn fire.

  2. Type-required regression (Elliott): ContinuationRuntimeConfig had 'earlyWarningBand: number' (required), breaking 3 test fixtures (config.test, scheduler.test, post-compaction-delegate-dispatch.test) with TS2741. Field already optional at zod + resolver-default site; making the type optional matches.

  3. Schema baseline regen (Elliott): src/config/schema.base.generated.ts needed regen to absorb the new earlyWarningBand field; preexisting models.providers.*.request.tls.insecureSkipVerify drift also absorbed in the same regen.

  Tests added:
  - checkContextPressure 'fires early-warning band below threshold when earlyWarningBand is set' (default-preservation path)
  - checkContextPressure 'does NOT fire below threshold when earlyWarningBand is 0' (opt-out path)

  All 107 affected tests pass: context-pressure (19), config (9), scheduler (12), schema.base.generated (10), post-compaction-delegate-dispatch (23), reply/context-pressure (34).

  Cohort cosign chain: 🩸 (root catch v5), 🌊 (default=0 catch), 🌫 (suppression-guard catch), 🌻 (type-required + baseline catch).

  Refs #515

---------

Co-authored-by: frond-scribe <frond-scribe@karmaterminal>
Co-authored-by: Test User <test@example.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: dandelion cult - cael 🩸 <cael@dandelion.cult>
Co-authored-by: dandelion cult - silas 🌫 <silas.dandelion.cult@hotmail.com>
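The band derivation and the corrected suppression guard described in this commit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's checkContextPressure: the helper names (deriveBands, resolveBand, shouldFire) are hypothetical, earlyWarningBand is passed explicitly so that no default is implied, and the per-band dedup the commit alludes to is not modeled.

```typescript
// Hypothetical helpers: names and signatures are illustrative only.

// Bands are derived from the threshold; the early-warning band is added
// only when it is non-zero (0 is treated as opt-out in this sketch).
function deriveBands(thresholdRatio: number, earlyWarningBand: number): number[] {
  const thresholdPct = Math.round(thresholdRatio * 100);
  const bands = [thresholdPct, 90, 95];
  if (earlyWarningBand > 0) {
    bands.push(earlyWarningBand);
  }
  // Dedupe and sort ascending so the highest reached band wins below.
  return [...new Set(bands)].sort((a, b) => a - b);
}

// Highest band the current ratio has reached, or 0 when none is reached.
function resolveBand(ratio: number, bands: number[]): number {
  const pct = ratio * 100;
  let band = 0;
  for (const b of bands) {
    if (pct >= b) {
      band = b;
    }
  }
  return band;
}

// Guard as described in the commit: suppress only when no band resolved
// AND the ratio is below the threshold, so a resolved early-warning band
// is no longer discarded by a bare 'ratio < threshold' check.
function shouldFire(ratio: number, thresholdRatio: number, earlyWarningBand: number): boolean {
  const band = resolveBand(ratio, deriveBands(thresholdRatio, earlyWarningBand));
  if (band === 0 && ratio < thresholdRatio) {
    return false;
  }
  return band > 0;
}
```

With ratio=0.25 and threshold=0.8, this sketch fires when earlyWarningBand is 25 and stays silent when it is 0, mirroring the two test cases named in the commit.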
ronan-dandelion-cult pushed a commit that referenced this pull request on May 3, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ronan-dandelion-cult pushed a commit that referenced this pull request on May 4, 2026
… test
The lane's switch from resolveContinuationRuntimeConfig to
resolveLiveContinuationRuntimeConfig in agent-runner.ts makes the
runner read continuation knobs from getRuntimeConfigSnapshot() with
fallback to the captured cfg. In production the snapshot is always set
to the full live config, so live-read returns the right values. In
tests, getRuntimeConfig() side-effects setRuntimeConfigSnapshot({})
via loadPinnedRuntimeConfig when no real config file exists, so the
snapshot becomes an empty object before the runner's enforcement
points run -- the (snap ?? fallback) check sees a non-null {} and
returns continuation defaults instead of the test's captured cfg.
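
The pitfall described above comes down to how nullish coalescing treats an empty object. The following is a minimal sketch, assuming simplified stand-ins for the snapshot helpers: the function names echo the commit message, but the bodies, the config shapes, and the default value of 10 are assumptions for illustration only.

```typescript
// Hypothetical shapes; the real config types are not shown in this thread.
type ContinuationCfg = { maxChainLength?: number };
type RuntimeConfig = { continuation?: ContinuationCfg };

// Assumed default for illustration, not taken from the source.
const DEFAULT_MAX_CHAIN_LENGTH = 10;

// Simplified stand-ins for the snapshot lifecycle helpers.
let snapshot: RuntimeConfig | null = null;
function setRuntimeConfigSnapshot(cfg: RuntimeConfig): void {
  snapshot = cfg;
}
function clearRuntimeConfigSnapshot(): void {
  snapshot = null;
}

// Live read: prefer the snapshot, fall back to the captured cfg.
// `??` only falls through on null/undefined, so an empty {} snapshot wins.
function resolveLiveMaxChainLength(fallback: RuntimeConfig): number {
  const cfg = snapshot ?? fallback;
  return cfg.continuation?.maxChainLength ?? DEFAULT_MAX_CHAIN_LENGTH;
}

const testCfg: RuntimeConfig = { continuation: { maxChainLength: 6 } };

// Snapshot clobbered with {}: the captured cfg is ignored, defaults apply.
setRuntimeConfigSnapshot({});
const withEmptySnapshot = resolveLiveMaxChainLength(testCfg);

// Seeding the snapshot from the test's cfg restores the expected value.
setRuntimeConfigSnapshot(testCfg);
const withSeededSnapshot = resolveLiveMaxChainLength(testCfg);

// Clearing the snapshot makes the fallback path take effect again.
clearRuntimeConfigSnapshot();
const afterClear = resolveLiveMaxChainLength(testCfg);
```

In this sketch the empty snapshot yields the assumed default, while the seeded and cleared cases both yield 6, which is why seeding the snapshot in the test setup restores the expected behavior.
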
agent-runner.continuation-span-uniformity.test.ts was added by #511
AFTER #536's branch was based, so it lacked the
setRuntimeConfigSnapshot(run.followupRun.run.config) /
clearRuntimeConfigSnapshot() lifecycle hooks the other span tests
already carry. Adding them surfaces the test's continuation cfg
(maxChainLength=6) to the live-read path, restoring the chain.step.remaining
math the test expects (6-(currentChainCount+1) = 4 for silent / 3
for silent-wake).
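
As a worked check of the arithmetic above, here is a small sketch. The function name is hypothetical, and the currentChainCount values of 1 and 2 are inferred from the stated results rather than given in the source.

```typescript
// Hypothetical helper mirroring the formula quoted in the commit message:
// remaining = maxChainLength - (currentChainCount + 1)
function chainStepRemaining(maxChainLength: number, currentChainCount: number): number {
  return maxChainLength - (currentChainCount + 1);
}

// maxChainLength=6 comes from the test cfg named in the commit message.
// The counts below are inferred so that the results match the stated 4 and 3.
const silentRemaining = chainStepRemaining(6, 1);
const silentWakeRemaining = chainStepRemaining(6, 2);
```
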
CI on PR #576 caught this. Same one-test pattern as
agent-runner.continuation-delegate-fire-span.test.ts and
agent-runner.continuation-work-span.test.ts.
ronan-dandelion-cult added a commit that referenced this pull request on May 4, 2026

… knobs, plan-aware CLI hint (supersedes #536) (#576)

* fix(config): hot-read tools.sessions.visibility for production session tools

  Adds optional getConfig accessor to session-tool factory option types (sessions-list, sessions-history, sessions-send, session-status, sessions-helpers::resolveSessionToolContext) so each execution can read the active runtime config rather than the construction-time snapshot.

  createOpenClawTools gains a liveSessionToolConfig opt-in that routes session-tool factories through getRuntimeConfig. The production wiring paths (pi-tools, gateway tool-resolution, inline-action skill dispatch) opt in. Other call sites keep the construction-time config to avoid churning shared snapshot assumptions.

  This makes openclaw config set tools.sessions.visibility=all take effect without rebuilding tool instances. Sessions-visibility re-read is a stateless decision at execution time, so no in-flight state is invalidated -- preserving the RFC §6.5 hot-reload integrity invariant.

  Closes #533. Refs RFC docs/design/continue-work-signal-v2.md §6.5.

* fix(continuation): resolve runtime knobs from active runtime snapshot

  Adds resolveLiveContinuationRuntimeConfig(fallbackCfg) that prefers getRuntimeConfigSnapshot() over the captured cfg snapshot. Switches the six per-turn enforcement points in agent-runner.ts and followup-runner.ts from resolveContinuationRuntimeConfig to the live variant:

  - pre-run context-pressure check (agent-runner.ts:1436): reads contextPressureThreshold and earlyWarningBand at next pressure check
  - bracket continuation scheduler (agent-runner.ts:2115): reads chain cap, cost cap, and delay window at schedule time
  - tool-delegate dispatch (agent-runner.ts:2526): reads chain cap, cost cap, delay window, and per-turn cap at dispatch time
  - hedge timer arm (agent-runner.ts:2889): reads chain cap at arm time
  - followup-path tool-delegate dispatch (followup-runner.ts:477): reads chain cap at dispatch time

  Each call site is at a decision-point or schedule-time read, so already-armed timers, queued retries, and staged post-compaction handoffs keep the values they captured at arm time. New schedules use new values. This preserves the RFC §6.5 in-flight-state integrity invariant while letting gateway/reload config-change events take effect at the next decision.

  Span tests (agent-runner.continuation-*-span.test.ts, agent-runner.misc.runreplyagent.test.ts) carry setRuntimeConfigSnapshot/clearRuntimeConfigSnapshot lifecycle hooks so the new resolution path is exercised under test.

  Closes #19. Refs RFC docs/design/continue-work-signal-v2.md §6.5.

* fix(cli): plan-aware reload hint for openclaw config set/patch/unset

  The "Restart the gateway to apply." message is now gated through buildGatewayReloadPlan so the user sees the truth: dynamic-read paths print "No gateway restart required.", hot-reload paths print "Gateway hot reload will apply.", and the original restart hint only fires when the planner actually requires a restart. Multi-path patches that touch a mix of dynamic and hot paths print the combined hint.

  Adds two regression pins in src/gateway/config-reload.test.ts so the dynamic-read disposition for tools.sessions.visibility and agents.defaults.continuation.maxDelegatesPerTurn does not silently flip back to restart-required.

  Closes #531.

* test(continuation): seed runtime snapshot in delegate span-uniformity test

  The lane's switch from resolveContinuationRuntimeConfig to resolveLiveContinuationRuntimeConfig in agent-runner.ts makes the runner read continuation knobs from getRuntimeConfigSnapshot() with fallback to the captured cfg. In production the snapshot is always set to the full live config, so live-read returns the right values. In tests, getRuntimeConfig() side-effects setRuntimeConfigSnapshot({}) via loadPinnedRuntimeConfig when no real config file exists, so the snapshot becomes an empty object before the runner's enforcement points run -- the (snap ?? fallback) check sees a non-null {} and returns continuation defaults instead of the test's captured cfg.

  agent-runner.continuation-span-uniformity.test.ts was added by #511 AFTER #536's branch was based, so it lacked the setRuntimeConfigSnapshot(run.followupRun.run.config) / clearRuntimeConfigSnapshot() lifecycle hooks the other span tests already carry. Adding them surfaces the test's continuation cfg (maxChainLength=6) to the live-read path, restoring the chain.step.remaining math the test expects (6-(currentChainCount+1) = 4 for silent / 3 for silent-wake).

  CI on PR #576 caught this. Same one-test pattern as agent-runner.continuation-delegate-fire-span.test.ts and agent-runner.continuation-work-span.test.ts.

---------

Co-authored-by: frond-scribe <frond-scribe@karmaterminal>
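The plan-aware reload hint described in the fix(cli) commit could look roughly like the sketch below. It is an illustration only: the return shape of buildGatewayReloadPlan is not shown in this thread, so the ReloadDisposition type and the reloadHint function are hypothetical, and joining the two messages with a space is an assumed wording for the "combined hint".

```typescript
// Hypothetical per-path disposition; the real planner output is not shown here.
type ReloadDisposition = "dynamic" | "hot" | "restart";

// Picks the hint for a config change touching one or more paths.
// Assumption: a single restart-required path makes the restart hint win.
function reloadHint(dispositions: ReloadDisposition[]): string {
  const kinds = new Set(dispositions);
  if (kinds.has("restart")) {
    return "Restart the gateway to apply.";
  }
  const parts: string[] = [];
  if (kinds.has("dynamic")) {
    parts.push("No gateway restart required.");
  }
  if (kinds.has("hot")) {
    parts.push("Gateway hot reload will apply.");
  }
  // Assumed combined wording for mixed dynamic + hot patches.
  return parts.join(" ");
}
```

Deriving the message from the set of dispositions, rather than printing a fixed string, is what keeps the hint truthful when a path such as tools.sessions.visibility moves to dynamic-read.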
Summary

Adds a runner integration trap for delegate-dispatch span uniformity across continuation delegate modes. The current tracer contract emits continuation.delegate.dispatch; the test covers the shipped silent and silent-wake tool-delegate paths and pins post-compaction as the remaining production gap with it.todo.

Gap-list reference: 🌊 Discord message 1499912840453296188 from the workorder (channel URL was not available in this worktree).

Per-mode result

| Mode | Result |
| --- | --- |
| silent | continuation.delegate.dispatch span; delegate.mode=silent; chain.id preserves the existing parent chain id; uniform key set. |
| silent-wake | continuation.delegate.dispatch span; delegate.mode=silent-wake; chain.id preserves the existing parent chain id; uniform key set. |
| post-compaction | it.todo: production does not emit continuation.delegate.dispatch with delegate.mode=post-compaction yet. |

Sabotage walk

To fail this trap, do any of the following:

- remove or rename the runner-side emitContinuationDelegateSpan call for accepted tool delegates;
- drop chain.id, delegate.mode, delegate.delivery, or chain.step.remaining from the emitted attributes;
- mint a fresh chain id instead of preserving an existing parent chain id;
- add the post-compaction production emitter without replacing the TODO with a passing assertion.

Gates

- pnpm tsgo
- pnpm check
- pnpm test src/auto-reply/reply/agent-runner.continuation-span-uniformity.test.ts
- pnpm test
- pnpm build
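
The uniformity contract in the Summary and Sabotage walk can be expressed as a small check over captured spans. The sketch below is not the test file from this PR: SpanRecord, checkUniform, and the sample values (including the "tool" delivery value and the "chain-1" id) are hypothetical, and only the span name and the four attribute keys come from the description above.

```typescript
// Hypothetical captured-span shape for illustration.
type SpanRecord = { name: string; attributes: Record<string, string | number> };

// Attribute keys named in the Sabotage walk.
const REQUIRED_KEYS = ["chain.id", "delegate.mode", "delegate.delivery", "chain.step.remaining"];

// Canonical, order-independent representation of a span's attribute keys.
function keySet(span: SpanRecord): string {
  return Object.keys(span.attributes).sort().join(",");
}

// Returns a list of contract violations; an empty list means uniform.
function checkUniform(spans: SpanRecord[], parentChainId: string): string[] {
  const problems: string[] = [];
  const first = spans[0];
  const reference = first ? keySet(first) : "";
  for (const span of spans) {
    const mode = String(span.attributes["delegate.mode"]);
    if (span.name !== "continuation.delegate.dispatch") {
      problems.push(`${mode}: unexpected span name ${span.name}`);
    }
    for (const key of REQUIRED_KEYS) {
      if (!(key in span.attributes)) {
        problems.push(`${mode}: missing ${key}`);
      }
    }
    if (span.attributes["chain.id"] !== parentChainId) {
      problems.push(`${mode}: chain.id does not preserve the parent chain id`);
    }
    if (keySet(span) !== reference) {
      problems.push(`${mode}: attribute key set differs from the first span`);
    }
  }
  return problems;
}

// Sample spans for the two shipped modes (values are placeholders).
const sampleSpans: SpanRecord[] = [
  {
    name: "continuation.delegate.dispatch",
    attributes: { "chain.id": "chain-1", "delegate.mode": "silent", "delegate.delivery": "tool", "chain.step.remaining": 4 },
  },
  {
    name: "continuation.delegate.dispatch",
    attributes: { "chain.id": "chain-1", "delegate.mode": "silent-wake", "delegate.delivery": "tool", "chain.step.remaining": 3 },
  },
];

const uniformProblems = checkUniform(sampleSpans, "chain-1");
```

Each item in the Sabotage walk maps to one of the pushed problems, so a regression in any single attribute or in chain-id preservation surfaces as a non-empty result.
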