
test(continuation): trap mode-only PendingContinuationDelegate at compat boundary #462

Merged
silas-dandelion-cult merged 1 commit into cael/325-canonical2 from silas/438-mode-only-pending-delegate-trap
May 1, 2026

Conversation

@silas-dandelion-cult

Closes #438.

What this PR adds

A single structural trap test at src/auto-reply/continuation/types.mode-shape.test.ts (228 lines, test-only — no production code changes) that pins three load-bearing invariants:

  1. Runtime objects are mode-only. consumePendingDelegates and consumeStagedPostCompactionDelegates return PendingContinuationDelegate shapes whose only mode-bearing field is mode. They MUST NOT expose silent / silentWake / postCompaction boolean runtime fields. Asserted across normal / silent / silent-wake / post-compaction modes (4 cases via it.each).

  2. Tool descriptor exposes mode enum, not boolean flags. continue_delegate parameter schema advertises mode as an enum over [normal, silent, silent-wake, post-compaction] and exposes NO silent / silentWake boolean parameters. (Schema-rendering robust: parses both enum and anyOf forms emitted by optionalStringEnum.)

  3. On-disk TaskFlow stateJson MAY still contain legacy booleans. Positive back-compat assertion: persisted stateJson projects mode='silent' → silent: true, and likewise for the other non-normal modes. This is what justifies the runtime/disk encoding split. Asserted across the three non-normal modes + a clean normal row.
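
As a rough illustration of invariants 1 and 3, the shape check and the disk projection could look like the sketch below. It is not the code in types.mode-shape.test.ts: `leakedLegacyFields` and `projectModeToLegacyFlags` are hypothetical helper names, and only the `silent` row of the projection is stated in this PR; the other two rows are assumptions.

```typescript
// Mode values named in the PR description.
type DelegateMode = "normal" | "silent" | "silent-wake" | "post-compaction";

// Legacy boolean fields that may exist on disk but must not appear on runtime objects.
const LEGACY_BOOLEAN_FIELDS = ["silent", "silentWake", "postCompaction"];

// Returns the legacy fields present on an object; an empty result means mode-only.
function leakedLegacyFields(delegate: object): string[] {
  return LEGACY_BOOLEAN_FIELDS.filter((field) => field in delegate);
}

// ASSUMED disk projection. Only silent -> { silent: true } is stated in the PR;
// the silent-wake and post-compaction rows are illustrative guesses.
function projectModeToLegacyFlags(mode: DelegateMode): Record<string, boolean> {
  switch (mode) {
    case "silent":
      return { silent: true };
    case "silent-wake":
      return { silentWake: true };
    case "post-compaction":
      return { postCompaction: true };
    default:
      return {}; // normal mode persists no boolean flags
  }
}
```

The point of keeping the two helpers separate is that the same field names are forbidden on one side of the boundary and permitted on the other.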

Why this is the trap that catches the next #433

#433 broke push because the squash silently reintroduced double-encoding (booleans + mode) at the compat boundary. The behavior tests didn't catch it because they assert behavior (does silent-wake wake?), not shape (does the runtime object expose only mode?). This trap asserts the shape, full stop.

Comments in types.ts and delegate-store.ts (#227 ADR) explicitly call out the encoding split as load-bearing, but until now nothing rejected boolean-shaped runtime delegates structurally. This PR closes that gap.

Verified load-bearing

Temporarily added ...(state.silent === true ? { silent: true } : {}) to flowToDelegate in delegate-store.ts — i.e., the exact regression class #227 warns against. Result: 1 of the 9 trap tests failed with the canonical runtime PendingContinuationDelegate must not expose 'silent' (mode-only encoding) message. Reverted. 9/9 passing on canonical2 cf7830ffb3702bf7d826d70838893e2e41709f12 baseline.
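
The verification above can be reproduced in miniature. This is a sketch under assumptions: `modeOnlyDelegate` and `leakyDelegate` are hypothetical stand-ins for flowToDelegate with and without the temporary spread, `PersistedFlowState` is an invented shape, and the error text is borrowed from the message quoted in this description.

```typescript
// Hypothetical persisted state; the real stateJson schema is not shown in this PR.
interface PersistedFlowState {
  task: string;
  mode: string;
  silent?: boolean;
}

// Mode-only conversion: legacy booleans on disk are dropped at the boundary.
function modeOnlyDelegate(state: PersistedFlowState): object {
  return { task: state.task, mode: state.mode };
}

// Deliberately broken variant that mirrors the temporary leak described above.
function leakyDelegate(state: PersistedFlowState): object {
  return {
    task: state.task,
    mode: state.mode,
    ...(state.silent === true ? { silent: true } : {}),
  };
}

// Structural trap: throws as soon as a legacy boolean field shows up at runtime.
function assertModeOnly(delegate: object): void {
  for (const field of ["silent", "silentWake", "postCompaction"]) {
    if (field in delegate) {
      throw new Error(
        `runtime PendingContinuationDelegate must not expose '${field}' (mode-only encoding)`,
      );
    }
  }
}
```

A behavior test would pass for both variants, since the mode is identical; only the shape assertion distinguishes them.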

Test surfaces

  • 7 runtime-object cases (4 mode variants × consumePending + 1 staged-post-compaction + 2 disk-back-compat)
  • 1 tool-descriptor case (mode enum present + boolean params absent)
  • 1 disk-back-compat negative (normal mode → no boolean flags persisted)
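
For the tool-descriptor case, a schema-rendering-robust read of the mode enum might be sketched as follows. The `SchemaNode` shape and the `enumValues` helper are assumptions for illustration; in particular, the `anyOf` branch assumes each alternative carries a string `const`, which may not match exactly what optionalStringEnum emits.

```typescript
// Loose JSON-Schema-like node; only the fields this sketch reads.
interface SchemaNode {
  enum?: unknown[];
  const?: unknown;
  anyOf?: SchemaNode[];
}

// Collects string enum members from either a flat `enum` or a nested `anyOf`.
function enumValues(node: SchemaNode): string[] {
  const values: string[] = [];
  if (Array.isArray(node.enum)) {
    for (const value of node.enum) {
      if (typeof value === "string") values.push(value);
    }
  }
  if (typeof node.const === "string") values.push(node.const);
  if (Array.isArray(node.anyOf)) {
    for (const branch of node.anyOf) values.push(...enumValues(branch));
  }
  return values;
}
```

Comparing the sorted result against the four expected modes keeps the assertion independent of which rendering the schema library picks.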

Receipts

Test Files  1 passed (1)
Tests  9 passed (9)

Refs

🌫️

…pat boundary

Adds structural trap test that pins:
1. Runtime objects from consumePendingDelegates and
   consumeStagedPostCompactionDelegates expose only `mode` —
   no `silent` / `silentWake` / `postCompaction` boolean
   runtime fields (mode-only encoding per #227).
2. continue_delegate tool descriptor advertises mode as an enum
   over [normal, silent, silent-wake, post-compaction] and exposes
   no `silent` / `silentWake` boolean parameters.
3. On-disk TaskFlow stateJson MAY still project to legacy boolean
   flags — back-compat positive assertion that justifies the
   runtime/disk encoding split.

Verified load-bearing: temporarily leaking `silent: true` from
flowToDelegate fails the runtime-object suite (1/9 fail), proving
the trap catches the regression class #433 nearly reintroduced.

9/9 passing on canonical2 cf7830f baseline.

Closes #438
@chatgpt-codex-connector

💡 Codex Review

if (delegate.delayMs && delegate.delayMs > 0) {
  const clampedDelay = Math.max(minDelayMs, Math.min(maxDelayMs, delegate.delayMs));

P1: Stop re-delaying delegates already matured by store consume

This branch applies a fresh setTimeout whenever delegate.delayMs > 0, but these delegates were just returned by consumePendingDelegates(...) in the same block, which only yields entries whose original delay horizon has already matured. Reapplying the delay here causes delayed delegates to wait an extra full interval (and after a restart, can postpone work far beyond the configured delay), which breaks expected continuation timing for any non-zero delaySeconds.
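
One possible way to avoid the double wait, sketched under assumptions: if the store recorded an absolute due time, the dispatcher could wait only for whatever remains of the original horizon. The `dueAtMs` field and both function names are hypothetical; the PR does not show how the store tracks maturity.

```typescript
// Hypothetical delegate shape carrying an absolute due time in milliseconds.
interface TimedDelegate {
  task: string;
  dueAtMs: number;
}

// Time still owed on the original delay; zero once the horizon has passed.
function remainingDelayMs(delegate: TimedDelegate, nowMs: number): number {
  return Math.max(0, delegate.dueAtMs - nowMs);
}

// Runs matured delegates immediately instead of scheduling a fresh full delay.
function dispatchDelegate(
  delegate: TimedDelegate,
  nowMs: number,
  run: (task: string) => void,
): void {
  const waitMs = remainingDelayMs(delegate, nowMs);
  if (waitMs === 0) {
    run(delegate.task);
    return;
  }
  setTimeout(() => run(delegate.task), waitMs);
}
```

Because the wait is derived from the due time rather than from delayMs, a restart after the horizon has passed would not add a second interval.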


const currentCount =
  pendingDelegateCount(sessionKey) + stagedPostCompactionDelegateCount(sessionKey);
if (currentCount >= maxPerTurn) {

P2: Enforce maxDelegatesPerTurn using turn-local count

The limit check uses pendingDelegateCount + stagedPostCompactionDelegateCount, which counts all currently queued delegates for the session, including backlog from earlier turns. That means a user can hit "maxDelegatesPerTurn" on a new turn solely because an older delayed/post-compaction delegate is still queued, even though this turn has not exceeded its own budget.
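
A turn-local budget could be sketched as below. This is an illustration of the idea in the comment, not a proposed patch: `createTurnBudget` is a hypothetical helper, and it assumes one budget object is created at the start of each turn.

```typescript
// Creates a counter scoped to a single turn; queued backlog never touches it.
function createTurnBudget(maxPerTurn: number) {
  let usedThisTurn = 0;
  return {
    // Reserves one slot if this turn still has budget left.
    tryReserve(): boolean {
      if (usedThisTurn >= maxPerTurn) return false;
      usedThisTurn += 1;
      return true;
    },
    // Number of delegates this turn has scheduled so far.
    used(): number {
      return usedThisTurn;
    },
  };
}
```

With this shape, an older delayed or post-compaction delegate still sitting in the queue does not count against a new turn.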



@elliott-dandelion-cult left a comment


🌻 — APPROVE. Test-only trap (+228/-0 single new file). Three load-bearing assertions:

  1. Runtime objects from consumePendingDelegates / consumeStagedPostCompactionDelegates expose only mode — no silent/silentWake/postCompaction boolean runtime fields.
  2. Tool descriptor advertises mode as enum [normal, silent, silent-wake, post-compaction], no boolean parameters.
  3. On-disk TaskFlow stateJson MAY still project legacy boolean flags — positive assertion preserving runtime/disk encoding split.

Verified-load-bearing (temp silent: true leak from flowToDelegate fails 1/9). Catches the regression class #433 nearly reintroduced. 9/9 passing on canonical2 cf7830f.

🌻

silas-dandelion-cult added a commit that referenced this pull request May 1, 2026
…getSessionKey absence (#463)

Adds [#446] complementary trap to the [#438] mode-only trap (PR #462):

  - Closed-set assertion on tool.parameters.properties keys:
    exactly [task, delaySeconds, mode], no more, no less. Catches
    additions of new model-facing parameters (cross-session
    addressing, retry knobs, priority) without an ADR.
  - mode enum membership pin (normal, silent, silent-wake,
    post-compaction) — robust to typebox enum vs anyOf rendering.
  - Boolean-runtime compatibility fields (silent, silentWake,
    postCompaction) MUST be absent at the descriptor surface.
    On-disk back-compat lives in the Zod state schema (#438), not
    here.
  - targetSessionKey absence reaffirmed with load-bearing reason in
    the failure message (cross-session addressing is the (b)-shape
    lane, binary-canticle#11).

Verified load-bearing: temporarily added targetSessionKey to
ContinueDelegateToolSchema, the closed-set assertion failed with
the expected got=[…, targetSessionKey, …] message. Restored.
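
A closed-set check of this kind can be sketched as follows. `closedSetDiff` is a hypothetical helper, not the assertion in the actual test; it only shows why an added key such as targetSessionKey surfaces immediately.

```typescript
// Compares actual keys against an exact expected set, reporting both directions.
function closedSetDiff(
  actual: string[],
  expected: string[],
): { unexpected: string[]; missing: string[] } {
  const actualSet = new Set(actual);
  const expectedSet = new Set(expected);
  return {
    unexpected: actual.filter((key) => !expectedSet.has(key)),
    missing: expected.filter((key) => !actualSet.has(key)),
  };
}

// Expected model-facing parameters, as listed in the commit message.
const EXPECTED_PARAMS = ["task", "delaySeconds", "mode"];
```

Usage would be along the lines of `closedSetDiff(Object.keys(tool.parameters.properties), EXPECTED_PARAMS)`, failing the test when either list is non-empty.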

9/9 passing on canonical2 cf7830f baseline.

Closes #446
@silas-dandelion-cult silas-dandelion-cult merged commit 35c65d4 into cael/325-canonical2 May 1, 2026
4 of 12 checks passed
cael-dandelion-cult pushed a commit that referenced this pull request May 2, 2026
Three test files from merged PRs (#462, #468, #511) were absent because
this branch forked from canonical2 before those PRs landed. The post-revert
allow-list audit (§3.4) flagged them as deletions from landed PRs.
Restored from canonical2 HEAD (74940e5).

- types.mode-shape.test.ts (#462)
- agent-runner.continuation-span-uniformity.test.ts (#511)
- store.continuation-merge.test.ts (#468)

tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
cael-dandelion-cult added a commit that referenced this pull request May 2, 2026
…6.4.24) (#515)

* wo(canonical2-rebase-pathB): rebase Path-B's 5 cleanup commits onto canonical2 (figs directive 22:55Z)

* chore(v3-cleanup): wave A cohort-identity scrub

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(v3-cleanup): drop rejected rebase artifacts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: scrub workspace template wording

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(v3-cleanup): wave B structural dedup of continuation runtime

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave B

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave C import discipline and build warnings

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave C

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(v3-cleanup): wave D surface continuation failures

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: surface compaction count reconcile failures

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(v3-cleanup): wave E continuation coverage

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 wave E

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: align bundled plugin dependency types

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test: isolate bedrock app profile runtime deps

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub fork process labels from source comments

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: close continuation type design blockers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore: scrub continuation prompt process link

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: journal canonical2 final checkpoint

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Revert "chore(v3-cleanup): drop rejected rebase artifacts"

This reverts commit 3396b88.

The original commit mass-deleted 30 files (6745 deletions) under the label
"rejected rebase artifacts." ~5141 of those deletions are landed swim-37
durability harness substrate from merged PRs #412/#413/#414/#416/#417/#418/#419
plus collateral docs/scripts. These are not rejected artifacts — they are
committed, merged test infrastructure that proves continuation durability
across compaction.

Cohort review (🩸 + 🌊 + 🌻 + 🌫) confirmed the block finding at
PR #515 issuecomment-4362337067.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs: release-note for context-pressure band-derivation behavior change

Wave B (cefa09d) changed context-pressure bands from fixed
[25, 80, 90, 95] to threshold-derived [thresholdPct, 90, 95].
At default 0.8 the implicit 25% early-warning band is removed.
Ship-acceptable per cohort review; release-note documents the change
and points to #516 for the earlyWarningBand config opt follow-up.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: restore landed-PR tests missing from rebase fork-point

Three test files from merged PRs (#462, #468, #511) were absent because
this branch forked from canonical2 before those PRs landed. The post-revert
allow-list audit (§3.4) flagged them as deletions from landed PRs.
Restored from canonical2 HEAD (74940e5).

- types.mode-shape.test.ts (#462)
- agent-runner.continuation-span-uniformity.test.ts (#511)
- store.continuation-merge.test.ts (#468)

tmp-drop-me-otel-span-uniformity.md omitted (copilot scratch; safe to drop).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add rebase.classify to ContinuationSpanName for restored tracer

The revert of 3396b88 restored src/rebase/tracer.ts which emits
"rebase.classify" spans. Commit 4871c81 (fix: close continuation
type design blockers) narrowed startSpan from string to
ContinuationSpanName after tracer.ts was deleted — additive fix to
include the span name in the union.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(continuation): add earlyWarningBand config opt for post-compaction cycle primer

* test(continuation): pin earlyWarningBand default-preservation + opt-out branches

* fix(continuation): add curly braces to satisfy linter

* fix(continuation): unblock early-warning band fire path + make field optional

Three bugs caught in cohort review of v5 (3e88ce5):

1. Suppression guard bug (Silas): non-postCompaction call sites bailed
   with 'ratio < threshold' BEFORE the resolved early-warn band could
   fire. Even with earlyWarningBand explicitly set, ratio=0.25 +
   threshold=0.8 resolved band=25 then was discarded. Guard now
   suppresses only when 'band === 0 && ratio < threshold' — preserves
   the round-to-band-0 dedup edge case while letting early-warn fire.

2. Type-required regression (Elliott): ContinuationRuntimeConfig had
   'earlyWarningBand: number' (required), breaking 3 test fixtures
   (config.test, scheduler.test, post-compaction-delegate-dispatch.test)
   with TS2741. Field already optional at zod + resolver-default site;
   making the type optional matches.

3. Schema baseline regen (Elliott): src/config/schema.base.generated.ts
   needed regen to absorb the new earlyWarningBand field; preexisting
   models.providers.*.request.tls.insecureSkipVerify drift also
   absorbed in the same regen.

Tests added:
- checkContextPressure 'fires early-warning band below threshold when
  earlyWarningBand is set' (default-preservation path)
- checkContextPressure 'does NOT fire below threshold when
  earlyWarningBand is 0' (opt-out path)

All 107 affected tests pass: context-pressure (19), config (9),
scheduler (12), schema.base.generated (10), post-compaction-delegate-
dispatch (23), reply/context-pressure (34).

Cohort cosign chain: 🩸 (root catch v5), 🌊 (default=0 catch),
🌫 (suppression-guard catch), 🌻 (type-required + baseline catch).

Refs #515

---------

Co-authored-by: frond-scribe <frond-scribe@karmaterminal>
Co-authored-by: Test User <test@example.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: dandelion cult - cael 🩸 <cael@dandelion.cult>
Co-authored-by: dandelion cult - silas 🌫 <silas.dandelion.cult@hotmail.com>
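
The band derivation and the corrected suppression guard described in the commit message above can be sketched roughly as below. All three function names are hypothetical and the real checkContextPressure logic is not shown in this thread; the sketch only encodes what the message states: bands of [thresholdPct, 90, 95], an optional early-warning band below the threshold, and suppression only when `band === 0 && ratio < threshold`.

```typescript
// Threshold-derived bands, plus the early-warning band when it is set (> 0).
function deriveBands(threshold: number, earlyWarningBand: number): number[] {
  const thresholdPct = Math.round(threshold * 100);
  const bands = [thresholdPct, 90, 95];
  if (earlyWarningBand > 0 && earlyWarningBand < thresholdPct) {
    bands.push(earlyWarningBand);
  }
  // De-duplicate and sort ascending so resolveBand can scan in order.
  return Array.from(new Set(bands)).sort((a, b) => a - b);
}

// Highest band at or below the current usage ratio; 0 when none is reached.
function resolveBand(ratio: number, bands: number[]): number {
  const pct = ratio * 100;
  let resolved = 0;
  for (const band of bands) {
    if (pct >= band) resolved = band;
  }
  return resolved;
}

// Guard as described in the fix: suppress only when no band resolved.
function shouldSuppress(band: number, ratio: number, threshold: number): boolean {
  return band === 0 && ratio < threshold;
}
```

With ratio 0.25 and threshold 0.8, an early-warning band of 25 resolves to band 25 and is not suppressed, while an early-warning band of 0 resolves to band 0 and is suppressed.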
ronan-dandelion-cult pushed a commit that referenced this pull request May 3, 2026
…getSessionKey absence (#463)

ronan-dandelion-cult pushed a commit that referenced this pull request May 3, 2026
…pat boundary (#462)


Development

Successfully merging this pull request may close these issues.

[architectural-decision] Pin mode-only PendingContinuationDelegate at compat boundary

2 participants