feat: Codex OAuth profile + interceptCompaction - accordion cadence for autonomous loops #665
Open
100yenadmin wants to merge 11 commits into
Establishes the configuration contract for the upcoming "accordion"
compaction cadence (90% trigger → 35% target → repeat) that pairs
with the openclaw session_before_compact intercept hook (separate PR).
Two new fields on LcmConfig:
compactionTargetFraction?: number
Post-compaction target as a fraction of token budget. When LCM
completes a compaction sweep, it continues until tokens drop to
`compactionTargetFraction * tokenBudget`. Default undefined
preserves legacy v4.1 behavior (sweep stops at contextThreshold).
codexOAuthProfile?: "auto" | "off"
Selector for the Codex OAuth defaults tier. When "auto" (default)
AND the resolved provider for the current plugin instance is
openai-codex authenticated via OAuth, `CODEX_OAUTH_DEFAULTS` is
applied as an intermediate tier in resolveLcmConfigWithDiagnostics:
env > pluginConfig > oauthDefaults > hardcoded defaults
Operator overrides always win.
CODEX_OAUTH_DEFAULTS values (justified inline in config.ts):
contextThreshold = 0.90 // matches Codex's auto-compact trigger from codex-rs source
compactionTargetFraction = 0.35 // lossless-claw heuristic, NOT a codex-parity claim
respectThresholdAsHardFloor = true // never bleed below the floor
proactiveThresholdCompactionMode = "deferred" // "accordion" cadence only makes sense in deferred mode
Resolver signature gains an optional `oauthProfileActive: boolean`
parameter (default false) passed by the plugin's register() function
after detecting OAuth via api.runtime.modelAuth.
LcmConfigDiagnostics gains `codexOAuthProfileApplied: boolean` so tests
and operators can verify whether the OAuth tier shaped the resolved
config.
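The tier order described above could be sketched roughly as follows. This is a minimal illustration, not the resolver in config.ts: `SketchConfig`, `HARDCODED_DEFAULTS_SKETCH`, and `resolveSketch` are hypothetical names, the hardcoded threshold of 0.75 is a placeholder, and env parsing is assumed to have already produced a partial object.

```typescript
// Hypothetical, trimmed-down config shape for illustration only.
interface SketchConfig {
  contextThreshold: number;
  compactionTargetFraction: number | undefined;
  codexOAuthProfile: "auto" | "off";
}

// Values taken from the commit message (the other two defaults are omitted).
const CODEX_OAUTH_DEFAULTS_SKETCH: Partial<SketchConfig> = {
  contextThreshold: 0.9,
  compactionTargetFraction: 0.35,
};

// Placeholder legacy defaults; the real values live in config.ts.
const HARDCODED_DEFAULTS_SKETCH: SketchConfig = {
  contextThreshold: 0.75,
  compactionTargetFraction: undefined,
  codexOAuthProfile: "auto",
};

function resolveSketch(
  env: Partial<SketchConfig>,
  pluginConfig: Partial<SketchConfig>,
  oauthProfileActive = false,
): { config: SketchConfig; codexOAuthProfileApplied: boolean } {
  // The selector itself resolves as env > pluginConfig > default.
  const selector =
    env.codexOAuthProfile ??
    pluginConfig.codexOAuthProfile ??
    HARDCODED_DEFAULTS_SKETCH.codexOAuthProfile;
  // The OAuth tier only participates when detection fired AND the selector is "auto".
  const applied = oauthProfileActive && selector === "auto";
  const oauth: Partial<SketchConfig> = applied ? CODEX_OAUTH_DEFAULTS_SKETCH : {};
  return {
    config: {
      // env > pluginConfig > oauthDefaults > hardcoded defaults
      contextThreshold:
        env.contextThreshold ??
        pluginConfig.contextThreshold ??
        oauth.contextThreshold ??
        HARDCODED_DEFAULTS_SKETCH.contextThreshold,
      compactionTargetFraction:
        env.compactionTargetFraction ??
        pluginConfig.compactionTargetFraction ??
        oauth.compactionTargetFraction ??
        HARDCODED_DEFAULTS_SKETCH.compactionTargetFraction,
      codexOAuthProfile: selector,
    },
    codexOAuthProfileApplied: applied,
  };
}
```

With `oauthProfileActive` left at its default of false, the sketch returns the placeholder legacy values and reports `codexOAuthProfileApplied: false`, which mirrors the backward-compat claim in this commit.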
Tests:
- 8 new OAuth-profile tests in test/config.test.ts covering:
legacy behavior (oauthProfileActive=false),
OAuth defaults applied (oauthProfileActive=true + auto),
explicit "off" disables OAuth tier,
env overrides OAuth defaults,
pluginConfig overrides OAuth defaults,
compactionTargetFraction default + plugin-config set,
codexOAuthProfile default + invalid-value fallback.
- 1 diagnostic-shape test updated to include the new field.
- 1615/1615 tests pass; zero new typecheck errors.
Backward compat:
- All new fields default to legacy v4.1 behavior when
oauthProfileActive is false (the default).
- codexOAuthProfile is OPTIONAL on LcmConfig so existing test
fixtures don't need updating.
PR follow-up to Martian-Engineering#619. Adds the lower-level plumbing
for the "accordion" compaction cadence (90% trigger → 35% target →
repeat) that pairs with the openclaw session_before_compact intercept
hook.

Three layers changed:

1. CompactionEngine.compact / compactFullSweep
   - New optional `stopAtTokens` param on both. When set, both Phase 1
     (leaf) and Phase 2 (condensed) loops break the moment running
     tokens drop to or below this value, regardless of force/threshold.
     Lets the accordion target stop AT the operator's floor instead of
     overshooting to the freshTailMaxTokens floor (~9% of budget).
   - Validated to positive finite numbers; bad values are normalized
     to null (no precise stop) and legacy behavior is preserved.

2. CompactionEngine.compactUntilUnder
   - Now forwards `stopAtTokens: targetTokens` to compact() ONLY when
     the resolved target is strictly less than tokenBudget. This
     preserves the existing auth-recovery semantics that depend on
     force=true running the sweep to exhaustion (circuit-breaker test
     verifies multi-pass invocation).

3. LcmContextEngine.compact (CompactionExecutionParams)
   - New optional `compactionTargetFraction?: number` in (0, 1].
   - When set + valid, becomes the effective target:
     targetTokens = floor(fraction * tokenBudget)
     and routes through the convergence loop (compactUntilUnder),
     which respects targetTokens. compactFullSweep is explicitly NOT
     used for fraction-target because it has no notion of a
     caller-supplied target (without the new stopAtTokens forwarding,
     which only compactUntilUnder uses).
   - Bad values (≤0, >1, non-finite, undefined) are silently ignored
     and the legacy compactionTarget enum is used.

Tests:
- 5 new validation tests in test/compaction-target-fraction.test.ts
- 1615/1615 passing across the suite (including the circuit-breaker
  auth-recovery test that depends on legacy behavior when
  targetTokens == tokenBudget).

Backward compat:
- All new params are optional with sane defaults
- When unset / invalid, behavior is byte-identical to pre-PR
- Existing callers (afterTurn, deferred drain, overflow recovery)
  don't pass the new params and get unchanged semantics
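For reviewers, the `stopAtTokens` normalization and the forwarding rule in `compactUntilUnder` might look roughly like the sketch below. The function names (`normalizeStopAtTokens`, `stopAtTokensToForward`, `sweepSketch`) are hypothetical, and the toy sweep subtracts a fixed amount per pass; the real Phase 1 / Phase 2 loops in CompactionEngine are not reproduced here.

```typescript
// Bad values (non-number, non-finite, zero, negative) normalize to null,
// meaning "no precise stop", so legacy behavior is preserved.
function normalizeStopAtTokens(value: unknown): number | null {
  return typeof value === "number" && Number.isFinite(value) && value > 0
    ? value
    : null;
}

// Forward a stop point only when the target is strictly below the budget,
// so force=true sweeps with target == budget still run to exhaustion.
function stopAtTokensToForward(
  targetTokens: number,
  tokenBudget: number,
): number | null {
  return targetTokens < tokenBudget ? normalizeStopAtTokens(targetTokens) : null;
}

// Toy sweep: each pass removes `savedPerPass` tokens until the stop point
// (if any) is reached, nothing is left, or `maxPasses` is exhausted.
function sweepSketch(
  startTokens: number,
  savedPerPass: number,
  maxPasses: number,
  stopAtTokens?: unknown,
): { tokens: number; passes: number } {
  const stop = normalizeStopAtTokens(stopAtTokens);
  let tokens = startTokens;
  let passes = 0;
  while (passes < maxPasses && tokens > 0) {
    if (stop !== null && tokens <= stop) break; // precise stop reached
    tokens = Math.max(0, tokens - savedPerPass);
    passes += 1;
  }
  return { tokens, passes };
}
```

In this toy model, a sweep from 100000 tokens with a stop point of 35000 halts at the first pass boundary at or below 35000, while an invalid stop point such as `NaN` behaves exactly like no stop point at all.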
…ompact

PR follow-up to Martian-Engineering#619. Implements the lossless-claw
side of the session_before_compact integration with pi-coding-agent
(verified by a prior research agent: the SDK exposes this event as a
true interception point, not a notification). When openclaw's PR #1
wires this to codex's compaction lifecycle, codex's GPT summarization
call is BYPASSED entirely and our lossless compaction output is used as
the replacement_history instead.

Method signature on LcmContextEngine:
interceptCompaction(request) → Promise<
  | { handled: true, summary, tokensAfter, firstKeptEntryId, tokensBefore }
  | { handled: false, reason: string }
>

Flow:
1. Five guard rails (ignored / stateless / no-target-fraction-configured /
   pre-compaction abort / mid-compaction abort) → handled:false
2. engine.compact() with force=true + the operator-configured
   compactionTargetFraction (defaults to 0.35 under OAuth profile via
   CODEX_OAUTH_DEFAULTS, applied at config layer in step 1's commit).
3. engine.assemble() to get the post-compaction AgentMessage[].
4. Guard 5: if assembled.messages is empty, return handled:false so
   codex falls back to its native compaction (don't hand codex a
   near-empty summary).
5. Serialize AgentMessage[] → single string via the new helper
   serializeAssembledMessagesForCompaction. Each message becomes a
   `[role]\n<body>` block; arrays of content blocks join text blocks
   plainly and JSON-encode tool_use/tool_result/image blocks to
   preserve structure.
6. Return handled:true with the summary, echoing firstKeptEntryId and
   tokensBefore unchanged for codex's history-shape invariants.

Defensive properties:
- Never throws across the SDK boundary (try/catch wraps steps 2-5;
  errors return handled:false with describeLogError).
- Respects AbortSignal before and during compaction.
- Honors all existing session-exclusion rules.

Tests (9 new in test/intercept-compaction.test.ts):
- handled:false for unset / invalid compactionTargetFraction
- handled:false for ignored / stateless sessions
- handled:false on pre-compaction abort
- handled:false when LCM has no assembled context (e.g. unseeded conv)
- Never throws on pathological inputs
- Validation surface matches (0, 1] contract

1631/1631 tests passing across the suite.

Backward compat: New method is purely additive. interceptCompaction is
ONLY called by openclaw when PR #1 wires the session_before_compact
handler; pre-PR-#1 plugin behavior is byte-identical.
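The return shape and the serialization step could be sketched as below. This is an assumption-laden illustration rather than the code in src/engine.ts: `SketchMessage`, `SketchBlock`, `serializeSketch`, and `interceptSketch` are hypothetical names, `firstKeptEntryId` is assumed to be a string, the separators between blocks and messages are guesses, and the `tokensAfter` estimate is a crude placeholder.

```typescript
type InterceptResultSketch =
  | { handled: true; summary: string; tokensAfter: number; firstKeptEntryId: string; tokensBefore: number }
  | { handled: false; reason: string };

interface SketchBlock { type: string; text?: string; [key: string]: unknown }
interface SketchMessage { role: string; content: string | SketchBlock[] }

// Each message becomes a "[role]\n<body>" block. Text blocks are joined
// plainly; any other block type is JSON-encoded to preserve its structure.
function serializeSketch(messages: SketchMessage[]): string {
  return messages
    .map((message) => {
      const body =
        typeof message.content === "string"
          ? message.content
          : message.content
              .map((block) =>
                block.type === "text" && typeof block.text === "string"
                  ? block.text
                  : JSON.stringify(block),
              )
              .join("\n");
      return `[${message.role}]\n${body}`;
    })
    .join("\n\n");
}

// Never throws: every failure is converted into handled:false.
async function interceptSketch(
  request: { tokensBefore: number; firstKeptEntryId: string },
  assemble: () => Promise<SketchMessage[]>,
): Promise<InterceptResultSketch> {
  try {
    const messages = await assemble();
    if (messages.length === 0) {
      // Let the host fall back to its native compaction.
      return { handled: false, reason: "no assembled context" };
    }
    const summary = serializeSketch(messages);
    return {
      handled: true,
      summary,
      tokensAfter: Math.ceil(summary.length / 4), // placeholder estimate only
      firstKeptEntryId: request.firstKeptEntryId, // echoed unchanged
      tokensBefore: request.tokensBefore, // echoed unchanged
    };
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return { handled: false, reason };
  }
}
```

Returning a discriminated union keyed on `handled` lets the caller branch without exceptions, which matches the "never throws across the SDK boundary" property listed above.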
PR follow-up to Martian-Engineering#619. Completes the OAuth profile by
connecting detection at plugin registration time to the config
resolver, so CODEX_OAUTH_DEFAULTS are applied when a Codex-configured
host registers.

Four changes:

1. detectCodexOAuthSync helper in plugin/index.ts
   - Best-effort SYNCHRONOUS detection at register-time
   - ANY-of signal detection: openclawDefaultModel prefix,
     summaryProvider/Model, expansionProvider/Model,
     largeFileSummaryProvider/Model
   - Strict prefix matching with case-insensitive + whitespace-trim
     handling
   - Defensive against non-string inputs
   - Why sync (not async): plugin register() is sync-shaped today;
     awaiting modelAuth.resolveApiKeyForProvider() would require
     touching many call sites. The v1 simplification accepts a slight
     false-positive surface (API-key Codex users get the OAuth profile
     unless they opt out via codexOAuthProfile: "off"). API-key codex
     is rare; trade-off favors getting OAuth users on the correct
     cadence today.
   - V2 plan: thread async detection through register() for strict
     `resolveApiKeyForProvider().mode === "oauth"` checking.

2. RuntimeModelAuthResult.mode field
   - Added optional mode?: string to the local type alias.
   - Mirrors `ResolvedProviderAuth.mode` in
     plugin-sdk/src/agents/model-auth-runtime-shared.d.ts.
   - Forward-compatible: when V2 lands, this field becomes the strict
     signal without typedef churn.

3. createLcmDependencies plumbing
   - Calls detectCodexOAuthSync(envSnapshot, pluginConfig) at the top
     of dep-building (after modelAuth resolves).
   - Passes the boolean to
     resolveLcmConfigWithDiagnostics(env, pc, oauthProfileActive) as
     the new third arg.
   - Failure path: wrapped in try/catch; detection errors log a
     warning and fall back to oauthProfileActive=false (legacy
     defaults). Plugin registration never blocks on detection.

4. Startup banner
   - When diagnostics.codexOAuthProfileApplied is true, logs an
     info-level banner once per process documenting the locked
     defaults (0.90 / 0.35 / hard-floor / deferred) and how to opt
     out. Operators can verify the profile is active from gateway
     logs.

Tests (8 new in test/codex-oauth-detection.test.ts):
- No signals → false (default)
- openclawDefaultModel prefix match
- summaryProvider / Model match
- expansionProvider / Model + largeFileSummary* match
- Non-string inputs (defensive)
- Whitespace + case-insensitive normalization
- ANY-OF semantics: one signal sufficient
- Strict prefix: "openai-codex-mini/foo" does NOT match

1640/1640 tests passing across the suite.

Backward compat:
- Detection failure path leaves oauthProfileActive=false → legacy
  defaults preserved.
- Hosts that don't use Codex see no behavior change.
- Codex hosts can opt out via `codexOAuthProfile: "off"`.
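A rough sketch of the ANY-of, strict-prefix detection follows. It is not the implementation in plugin/index.ts: `matchesCodexSketch` and `detectCodexSketch` are hypothetical names, the signals are assumed to arrive as one flat record, and treating the bare provider id `openai-codex` as a match (in addition to `openai-codex/...`) is an assumption.

```typescript
const CODEX_PROVIDER_SKETCH = "openai-codex";

// Strict match: the provider id alone, or the id followed by "/".
// "openai-codex-mini/foo" must NOT match.
function matchesCodexSketch(value: unknown): boolean {
  if (typeof value !== "string") return false; // defensive against non-strings
  const normalized = value.trim().toLowerCase();
  return (
    normalized === CODEX_PROVIDER_SKETCH ||
    normalized.startsWith(`${CODEX_PROVIDER_SKETCH}/`)
  );
}

// Signal names taken from the commit message above.
const SIGNAL_KEYS_SKETCH = [
  "openclawDefaultModel",
  "summaryProvider",
  "summaryModel",
  "expansionProvider",
  "expansionModel",
  "largeFileSummaryProvider",
  "largeFileSummaryModel",
] as const;

// ANY-of semantics: a single matching signal is sufficient.
function detectCodexSketch(signals: Record<string, unknown>): boolean {
  return SIGNAL_KEYS_SKETCH.some((key) => matchesCodexSketch(signals[key]));
}
```

Because the check is synchronous and purely string-based, it cannot distinguish OAuth from API-key authentication, which is the false-positive surface the commit message accepts for v1.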
PR follow-up to Martian-Engineering#619. Exposes the two new config
fields via the plugin manifest so they appear in the openclaw plugin UI
(with uiHints) and pass gateway-side schema validation (configSchema).

uiHints additions:
- codexOAuthProfile: explains "auto" detection + opt-out
- compactionTargetFraction: explains pairing with contextThreshold for
  the accordion cadence

configSchema additions:
- codexOAuthProfile: string enum ["auto", "off"]
- compactionTargetFraction: number in (0, 1] (exclusiveMinimum 0)

48 schema properties total. JSON parses cleanly.
1640/1640 tests passing (no test changes; schema is data, not code).
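The two schema entries might look like the following, written here as a TypeScript literal rather than the JSON in the manifest. The surrounding manifest layout is omitted, and `configSchemaSketch` and `passesSchemaSketch` are hypothetical names; the hand-rolled check only mirrors the stated constraints and is not the gateway's validator.

```typescript
// Sketch of the two configSchema entries described above.
const configSchemaSketch = {
  codexOAuthProfile: { type: "string", enum: ["auto", "off"] },
  compactionTargetFraction: { type: "number", exclusiveMinimum: 0, maximum: 1 },
} as const;

// Hand-rolled check mirroring those constraints (no schema library assumed).
function passesSchemaSketch(
  key: keyof typeof configSchemaSketch,
  value: unknown,
): boolean {
  if (key === "codexOAuthProfile") {
    return value === "auto" || value === "off";
  }
  const { exclusiveMinimum, maximum } = configSchemaSketch.compactionTargetFraction;
  // (0, 1]: strictly greater than 0, at most 1. NaN fails both comparisons.
  return typeof value === "number" && value > exclusiveMinimum && value <= maximum;
}
```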
Three concrete behavioral fixes plus test rewrites, after the cross-PR
architecture review flagged the wiring as dead-on-arrival.
## P0: LCM was the silent reason intercept never fired
- `src/engine.ts`: add `interceptsCompaction: migrationOk` to `LcmContextEngine.info`.
This is the capability flag openclaw's compaction-intercept factory
gate reads (`engineInfo?.interceptsCompaction === true`). Without it,
the factory never registers and codex's native GPT compaction fires at
90% context instead of LCM's lossless path. The new docstring covers
why BOTH `ownsCompaction` AND `interceptsCompaction` are correct: they
advertise capability against distinct lanes (queued vs SDK event).
- `test/engine.test.ts`: new assertion `engine.info.interceptsCompaction === true`
alongside the existing `ownsCompaction` check, with a comment naming
the openclaw gate this flag drives.
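The gate described above reduces to a strict boolean check. The sketch below uses hypothetical names (`EngineInfoSketch`, `shouldRegisterInterceptSketch`); only the `engineInfo?.interceptsCompaction === true` expression comes from this commit message.

```typescript
// Hypothetical slice of the engine info object.
interface EngineInfoSketch {
  ownsCompaction?: boolean;
  interceptsCompaction?: boolean;
}

// Strict equality: an absent flag means the intercept factory never registers,
// regardless of whether ownsCompaction is set.
function shouldRegisterInterceptSketch(engineInfo?: EngineInfoSketch): boolean {
  return engineInfo?.interceptsCompaction === true;
}
```

An engine that advertises only `ownsCompaction` fails this check, which is consistent with the intercept never firing before the flag was added.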
## P0: detectCodexOAuthSync test exercised a local mirror, not the impl
- `src/plugin/index.ts`: add `__test_only_detectCodexOAuthSync` re-export
so tests can import the REAL function. Prior tests duplicated the logic
locally — a divergence between implementation and contract would have
produced false-confidence green tests.
- `test/codex-oauth-detection.test.ts`: replace the local `detectMirror`
copy with `import { __test_only_detectCodexOAuthSync as detectMirror }
from "../src/plugin/index.js"`. The 9 cases now exercise the actual
detection logic; the test surface stays identical so regressions surface.
## P1: 0.05 safety floor on compactionTargetFraction
- `src/engine.ts`: tighten the validation range from `(0, 1]` to `[0.05, 1]`
with a warning log on under-the-floor values. Rationale: fractions
below 0.05 (≈ 12.8K tokens on a 258K window) undercut system-prompt +
tool-defs overhead on most contexts, causing the convergence loop to
spin until `maxRounds`. The most likely cause of such inputs is a
user/config typo (e.g. `0.05` vs `0.5`), so we conservatively bail to
the enum-based default.
- `src/tools/lcm-compact-tool.ts`: mirror the floor in the `lcm_compact`
tool's `targetFraction` validator (defense in depth at the agent-tool
boundary).
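The tightened predicate and the fraction-to-tokens conversion could be sketched like this. `FRACTION_FLOOR_SKETCH`, `isValidTargetFractionSketch`, and `fractionToTargetTokensSketch` are hypothetical names; returning `null` stands in for "fall back to the enum-based default".

```typescript
const FRACTION_FLOOR_SKETCH = 0.05;

// Valid range is [0.05, 1]; anything else (including NaN, Infinity,
// undefined, and non-numbers) is rejected.
function isValidTargetFractionSketch(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isFinite(value) &&
    value >= FRACTION_FLOOR_SKETCH &&
    value <= 1
  );
}

// targetTokens = floor(fraction * tokenBudget), or null when the fraction
// is invalid and the caller should use the legacy enum-based target.
function fractionToTargetTokensSketch(
  fraction: unknown,
  tokenBudget: number,
): number | null {
  return isValidTargetFractionSketch(fraction)
    ? Math.floor(fraction * tokenBudget)
    : null;
}
```

For example, a fraction of 0.5 on a 258000-token budget yields a target of 129000 tokens, while 0.04 is rejected and yields `null`.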
## Tests: replace no-op math tests with real contract coverage
- `test/compaction-target-fraction.test.ts`: rewrite from 3 mirror-math
tests to 21 tests across 4 groups:
1. compactFullSweep input contract: parameterized smoke test that
compactFullSweep accepts undefined / 0 / -1 / NaN / ±Infinity /
positive integers / fractional / 1e9 without throwing.
2. compactionTargetFraction validation predicate (now [0.05, 1]).
3. fraction → stopAtTokens conversion (floor(fraction * tokenBudget))
against the canonical Codex OAuth values.
4. compactFullSweep internal normalization predicate.
Local re-verification: 267/267 tests passed (4.14s) across the affected
files. The Phase 1/Phase 2 stop-loop branches in compactFullSweep are
exercised end-to-end via test/intercept-compaction.test.ts (engine-level)
and via the live integration tests in test/v41-* (production-shape).
Concrete fixes from the second-wave audit:

## P1: unify the 0.05 safety floor across all validation gates
- `src/engine.ts` (Guard 2 in `interceptCompaction`, line ~7515): tighten
  validation from `(0, 1]` to `[0.05, 1]`. Wave-B B4 caught the
  divergence: a fraction in (0, 0.05) used to pass Guard 2 (proceeded
  with intercept), then trigger a warning + fallback inside `compact()`.
  Now Guard 2 short-circuits before the LLM call, which saves the wasted
  token cost. Aligns with the engine.compact() floor and the lcm_compact
  tool's validator.
- `src/tools/lcm-compact-tool.ts` (TypeBox schema): change
  `exclusiveMinimum: 0` → `minimum: 0.05`. Wave-B B2 caught that the
  agent-facing schema declared a wider range than the runtime accepted;
  agents passing 0.02 used to pass schema validation, then get silent
  fallback with no error signal. Schema-level enforcement now matches
  runtime enforcement, and the updated description names the rationale.
- `test/v41-lcm-compact-tool.test.ts`: update the schema-assertion test
  to match the tightened bound.

## P1: dedup the under-floor warning log
- `src/engine.ts` (LcmContextEngine): add
  `warnedFloorFractions: Set<number>` field and `warnBelowFloorOnce()`
  private method. Codex autonomous loops fire compaction 1-5× per turn;
  a misconfigured `compactionTargetFraction = 0.02` would otherwise log
  dozens of identical warnings per session. Dedup is per-fraction-value
  (not per-session): a DIFFERENT bad number gets a fresh warning because
  the operator may be iterating on the config. The warning text names
  the dedup behavior explicitly so operators understand why they don't
  see a 2nd warning.

Local re-verification: 291/291 tests passed on the affected files after
the changes. Aggregate test count grew from 267 (post-wave-A) as the new
tests in extensions.test.ts (openclaw side) drove additions here.
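The per-fraction dedup could be sketched as a small class. `FloorWarnerSketch` and the warning wording are hypothetical; only the `warnedFloorFractions` set and the `warnBelowFloorOnce` name come from this commit message, and in the engine the logger is reached through its dependencies rather than a constructor argument.

```typescript
class FloorWarnerSketch {
  // One entry per distinct bad fraction that has already been reported.
  private readonly warnedFloorFractions = new Set<number>();
  private readonly warn: (message: string) => void;

  constructor(warn: (message: string) => void) {
    this.warn = warn;
  }

  warnBelowFloorOnce(fraction: number): void {
    // Same value seen before: stay silent.
    if (this.warnedFloorFractions.has(fraction)) return;
    this.warnedFloorFractions.add(fraction);
    this.warn(
      `compactionTargetFraction=${fraction} is below the 0.05 floor; ` +
        "falling back to the default target. " +
        "This warning is logged once per distinct value.",
    );
  }
}

// Usage: three calls with 0.02 and one with 0.03 produce two warnings.
const collectedWarnings: string[] = [];
const warner = new FloorWarnerSketch((message) => collectedWarnings.push(message));
warner.warnBelowFloorOnce(0.02);
warner.warnBelowFloorOnce(0.02);
warner.warnBelowFloorOnce(0.02);
warner.warnBelowFloorOnce(0.03);
```

Keying the set on the fraction value rather than on the session means an operator who changes a bad value to a different bad value still gets fresh feedback.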
…lowFloorOnce dedup

Wave-3 audit flagged that two wave-B behavioral fixes had no regression
test coverage: a silent revert would not be caught by any existing
test. This commit adds 5 new tests:

## 0.05 floor regression guards (3 tests)
- `interceptCompaction returns handled:false for compactionTargetFraction below 0.05`
  (0.01, 0.02, 0.04, 0.0499): asserts Guard 2 unification with the
  engine.compact() floor (wave-B B4).
- `interceptCompaction accepts compactionTargetFraction at the 0.05 floor (boundary)`:
  asserts the at-floor value is accepted by Guard 2 (downstream reason
  may vary by session state, but the boundary check itself MUST pass).

## warnBelowFloorOnce dedup (3 tests)
- `warns once for repeated calls with the same fraction value`: asserts
  the dedup Set behavior; three identical calls produce one warning.
- `warns separately for DIFFERENT fraction values`: asserts dedup is
  per-fraction-value, not session-wide, so operators iterating on the
  config see fresh feedback.
- `warning text names the dedup behavior`: asserts the warning message
  includes language explaining why subsequent calls are silent (so
  operators don't misread "no further warnings" as "fix accepted").
- `records the fraction value in the internal dedup set`: white-box
  guard against accidental Set replacement during refactor.

## Test-infrastructure changes
- `makeMinimalDeps` and `makeEngine` accept an optional `logOverrides`
  parameter so tests can inject a `vi.fn()` spy for `log.warn` etc.
- The warnBelowFloorOnce tests exercise the private method via a
  TypeScript cast to `EngineWithPrivates` (the dedup logic doesn't need
  a real conversation seeded, only direct invocation).

Also fixes a path bug in `warnBelowFloorOnce`: was `this.log.warn`,
should be `this.deps.log.warn` (matches the rest of the engine class;
the engine accesses logs via deps, not a class field).

297/297 tests pass on the affected files after the changes.
Maintainer review: needs rebase before deeper review.
Why
Pairs with openclaw/openclaw#81164, an upstream openclaw PR that adds the `ContextEngine.interceptCompaction` contract and wires it into pi-coding-agent's `session_before_compact` SDK event. Together they implement an accordion compaction cadence for Codex autonomous loops: agent fills to ~90% → openclaw routes the compaction event to LCM → LCM losslessly compacts to ~35% → agent resumes with an LCM-derived summary instead of codex's lossy GPT one.

Current state (without these PRs): codex's native GPT compaction fires at 90%, summarizes the whole history into a single short string, and replaces the history. Repeated mid-turn across 100-1000-call autonomous loops, this compounds lossiness until almost no original context remains.
Architecture
The Codex OAuth profile (this PR) locks the cadence values when openclaw detects codex:
- `contextThreshold`: 0.90
- `compactionTargetFraction`: 0.35
- `proactiveThresholdCompactionMode`: "deferred"

Operator opt-out: `codexOAuthProfile: "off"` returns to legacy behavior.
What's in this PR
8 commits, 10 files, +2460/-12 lines, rebased clean onto `martian/main`:
- feat(config): `LcmConfig` fields: `compactionTargetFraction?`, `codexOAuthProfile?`. `CODEX_OAUTH_DEFAULTS` constant (0.90 trigger / 0.35 floor / deferred). New `oauthProfileActive` param to `resolveLcmConfigWithDiagnostics`. New `codexOAuthProfileApplied` diagnostic.
- feat(compaction): `stopAtTokens` param on `CompactionEngine.compact` + `compactFullSweep` for precise stop points. `compactionTargetFraction` plumbed through `executeCompactionCore`. `compactUntilUnder` forwards `stopAtTokens` only when target is strictly less than tokenBudget (preserves legacy auth-recovery semantics).
- feat(engine): `engine.interceptCompaction()` method. 5 guard rails + lcm-produced-no-context bailout. Never throws (returns `handled:false` on internal error). `info.interceptsCompaction = migrationOk` so openclaw's compaction-intercept gate registers the handler.
- feat(plugin): `detectCodexOAuthSync` helper. Plumbed into config resolver as `oauthProfileActive`. Startup banner when profile applied. `__test_only_detectCodexOAuthSync` export so tests exercise the real implementation.
- feat(manifest): `openclaw.plugin.json` configSchema + uiHints for both new fields.
- fix(intercept) wave-A: `info.interceptsCompaction = true` (the missing capability flag the audit caught). 0.05 safety floor on `compactionTargetFraction`. `__test_only_detectCodexOAuthSync` export. Behavioral test rewrite for `stopAtTokens`.
- fix(intercept) wave-B: floor unified to `[0.05, 1]` across validation gates. Warning log deduped via `warnBelowFloorOnce` (private helper + per-fraction Set).
- test(intercept) wave-3: `warnBelowFloorOnce` dedup behavior.

Adversarial review: three waves
Three independent adversarial review waves were performed by separate research agents:
- … `interceptsCompaction` flag so the openclaw wiring was dead), test-quality gaps, and a 0.05 floor sanity check. All fixed.
- … `lcm_compact` tool, warning log spam under bursty loops, weak test surface. All fixed.

How the values were chosen
- … `(context_window * 9) / 10` in `codex-rs/protocol/src/openai_models.rs:322`).
- … `20000 / context_window` (~8% of a 258K window). We target HIGHER because we preserve lineage on disk and drilldown via `lcm_describe` works. Keeping ~35% of budget filled with summaries leaves ~65% free for the next compaction cycle.

Tests
953/953 passing locally on commit `685fe38`. The new contract is exercised by:
- `test/engine.test.ts`: 249 cases (including `interceptsCompaction` capability assertion)
- `test/intercept-compaction.test.ts`: 15 cases (guard rails, return shape, never-throws, dedup regression)
- `test/compaction-target-fraction.test.ts`: 21 cases (validation predicate, fraction → stopAtTokens conversion, normalization predicate, input-contract no-throw smoke)
- `test/codex-oauth-detection.test.ts`: 9 cases (real impl via `__test_only_detectCodexOAuthSync` export, no more local-mirror false confidence)
- `test/config.test.ts`: 69 cases (including 5 new OAuth-profile cases)

Backward compat
Every new field and method is optional with a sane default:
- `compactionTargetFraction` undefined → legacy "compact to threshold" behavior
- `codexOAuthProfile` undefined → defaults to `"auto"`, but applies nothing unless `oauthProfileActive` is true
- `interceptCompaction` not called by openclaw until openclaw#81164 is merged → no behavior change pre-merge

Hosts that don't use codex see ZERO behavior change. Codex hosts get the new defaults automatically once the upstream PR lands, with an explicit opt-out (`codexOAuthProfile: "off"`).

Scope notes
This PR is intentionally scoped to just the Codex OAuth profile + `interceptCompaction` contract. Two related changes were intentionally excluded to keep this PR independently reviewable:
- `lcm_compact` tool's `targetFraction` parameter: the original 7-commit working branch included a 3-commit chain on `src/tools/lcm-compact-tool.ts` (add `targetFraction` param + raise per-window cap 2→10 + wave-A/B floor mirroring + schema tightening). That tool itself doesn't exist on `main` yet (it's added by #613), so those changes will be filed as a separate small follow-up PR once #613 (feat(lcm): v4.1 - LCM V2 (replaces #516; companion #616 deferred)) lands. Nothing in this PR functionally depends on the tool; agents can drive the same accordion cadence via the OAuth profile defaults.
- `respectThresholdAsHardFloor = true` in the OAuth defaults: the canonical OAuth profile (per the original commit message) sets the hard-floor flag, but that field is added by #619 (feat(compaction): opt-in hard-floor for incremental compaction below contextThreshold), which is also pending. Without #619, the OAuth defaults still install the 90%/35% accordion via `contextThreshold` + `compactionTargetFraction`, but cold-cache catch-up compactions are not floor-blocked. A `// NOTE:` in `CODEX_OAUTH_DEFAULTS` documents this and points at #619. Once #619 lands, a 1-line follow-up adds the field to the defaults.

The net effect: this PR delivers the complete codex-mediated intercept flow (the user-visible behavior change) without bundling unrelated work or transitively depending on two other open PRs.
Test plan
- … `martian/main`: no merge conflicts
- … `codexOAuthProfile: "off"` returns to legacy behavior