fix: harden bootstrap budget against oversized messages and NaN config#258
Conversation
Two bugs in the bootstrap budget cap introduced in Martian-Engineering#255: 1. A single oversized tail message bypasses the budget entirely. The trim loop condition 'if (kept.length > 0 && ...)' means the first message (newest) is always kept regardless of size. A 50K-token tool result as the last message will bypass a 6K budget. Fix: after the loop, check if the single kept message exceeds budget and return empty instead of silently bypassing. 2. NaN propagates through all numeric env config parsing. parseInt('oops', 10) returns NaN, which is not nullish, so ?? fallback never fires. Invalid env like LCM_LEAF_CHUNK_TOKENS=oops propagates NaN through leafChunkTokens, bootstrapMaxTokens, and every derived config value — effectively disabling all token budgets. Fix: add parseFiniteInt/parseFiniteNumber helpers that return undefined for non-finite results. Replace all 16 raw parseInt/parseFloat calls in resolveLcmConfig() with the safe helpers. Both bugs were found and reproduced with minimal scripts during adversarial review of a production incident.
There was a problem hiding this comment.
Pull request overview
Hardens LCM bootstrap token budgeting and configuration parsing to prevent budget bypasses from oversized tail messages and to stop invalid numeric env vars from propagating NaN through token-budget computations.
Changes:
- Prevents a single oversized newest bootstrap message from being kept when it exceeds the bootstrap token budget.
- Introduces
parseFiniteInt()/parseFiniteNumber()and uses them acrossresolveLcmConfig()so invalid env numeric values fall back correctly via??.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
src/engine.ts |
Adds a post-trim guard to drop an oversized singleton tail message instead of bypassing the bootstrap budget. |
src/db/config.ts |
Adds safe env parsing helpers and replaces raw parseInt/parseFloat usage to avoid NaN propagation in resolved config. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // If a single oversized tail message exceeds the budget, return empty | ||
| // rather than silently bypassing the budget cap. An empty bootstrap is | ||
| // safer than an exploding one. | ||
| if (kept.length === 1 && totalTokens > safeMaxTokens) { | ||
| return []; | ||
| } |
There was a problem hiding this comment.
Returning [] here can also mean “messages exist but newest message exceeds the bootstrap budget”. The first-time bootstrap caller currently treats bootstrapMessages.length === 0 as “no leaf-path messages in session” and persists state accordingly, which becomes a misleading result/message after this change. Consider signaling this case explicitly (e.g., return a distinct status/reason from trimBootstrapMessagesToBudget, or have the caller branch on historicalMessages.length > 0 to return a more accurate reason like “tail message exceeds bootstrapMaxTokens”).
| // If a single oversized tail message exceeds the budget, return empty | ||
| // rather than silently bypassing the budget cap. An empty bootstrap is | ||
| // safer than an exploding one. | ||
| if (kept.length === 1 && totalTokens > safeMaxTokens) { | ||
| return []; |
There was a problem hiding this comment.
This change makes trimBootstrapMessagesToBudget() potentially return an empty array even when messages.length > 0, which contradicts the function’s doc comment (“The newest message is always preserved…”). Please update the documentation/contract to reflect the new oversized-tail behavior so callers don’t assume non-empty output when input is non-empty.
| // If a single oversized tail message exceeds the budget, return empty | ||
| // rather than silently bypassing the budget cap. An empty bootstrap is | ||
| // safer than an exploding one. | ||
| if (kept.length === 1 && totalTokens > safeMaxTokens) { | ||
| return []; | ||
| } |
There was a problem hiding this comment.
There’s existing bootstrap token-cap coverage in test/engine.test.ts, but this new edge case (single newest message exceeds budget => empty bootstrap) isn’t covered. Adding a regression test for an oversized tail message would help prevent reintroducing the bypass (and validate the expected behavior when bootstrapMessages becomes empty).
| /** Safely parse a finite integer from an environment string, or return undefined. | ||
| * Unlike raw parseInt(), this returns undefined for NaN so ?? fallback works. */ | ||
| function parseFiniteInt(value: string | undefined): number | undefined { | ||
| if (value === undefined) return undefined; | ||
| const parsed = parseInt(value, 10); | ||
| return Number.isFinite(parsed) ? parsed : undefined; | ||
| } | ||
|
|
||
| /** Safely parse a finite float from an environment string, or return undefined. */ | ||
| function parseFiniteNumber(value: string | undefined): number | undefined { | ||
| if (value === undefined) return undefined; | ||
| const parsed = parseFloat(value); | ||
| return Number.isFinite(parsed) ? parsed : undefined; | ||
| } |
There was a problem hiding this comment.
The NaN-hardening here looks correct, but there’s no automated coverage ensuring invalid numeric env vars (e.g. LCM_LEAF_CHUNK_TOKENS='oops', LCM_BOOTSTRAP_MAX_TOKENS='oops') fall back to plugin/default values instead of producing NaN. Please add tests in test/config.test.ts for at least one int and one float env var to lock in the new parseFiniteInt/parseFiniteNumber behavior.
Add focused regression tests for the oversized singleton bootstrap tail case and invalid numeric env parsing fallback behavior. Add a patch changeset because this PR changes runtime behavior and should be reflected in release notes. Regeneration-Prompt: | The open PR fixed two production regressions but still lacked the release and test follow-through needed to merge. Add targeted regression coverage instead of broad refactors: one config test that proves invalid numeric env values like LCM_LEAF_CHUNK_TOKENS=oops fall back through plugin/default resolution, and one bootstrap test that proves a single oversized tail message is dropped instead of bypassing bootstrapMaxTokens. Also add a patch changeset because the PR changes runtime behavior visible to users and maintainers expect release notes coverage for that.
|
Thank you! |
The previous docstring claimed 'The newest message is always preserved', which is no longer true after the oversized-singleton guard. Updated to document the new behavior: returns empty when only one message exists and it exceeds the budget. Addresses review feedback on Martian-Engineering#258.
|
Thanks for the review feedback — addressed the docstring inconsistency (pushed). On the test coverage requests (oversized-tail regression test + NaN env config fallback test): both are valid and I'd be happy to add them, but I don't have test infrastructure wired up in the fork. If you'd prefer I add the test cases in a follow-up commit, could you point me at the right test runner setup? Otherwise I can describe the test cases and let a maintainer add them: Oversized tail test ( // Single message exceeding bootstrapMaxTokens should return empty
const result = trimBootstrapMessagesToBudget(
[{ role: 'assistant', content: 'x'.repeat(5000) }], // ~1250 tokens
100
);
expect(result).toEqual([]);NaN config fallback test ( // Invalid env should fall back to defaults, not NaN
const config = resolveLcmConfig({ LCM_LEAF_CHUNK_TOKENS: 'oops' }, {});
expect(config.leafChunkTokens).toBe(20000);
expect(Number.isFinite(config.bootstrapMaxTokens)).toBe(true);
const config2 = resolveLcmConfig({ LCM_CONTEXT_THRESHOLD: 'invalid' }, {});
expect(config2.contextThreshold).toBe(0.75); |
Problem
Two bugs in the bootstrap budget cap introduced in #255:
1. Oversized single tail message bypasses the budget
The trim loop condition
if (kept.length > 0 && totalTokens + tokenCount > safeMaxTokens)means the first message processed (the newest, at the tail) is always kept regardless of its token count.Reproduction:
This matters in practice because tool results and large assistant responses can easily be 50K+ tokens as the tail message.
2. NaN propagates through all numeric env config parsing
Every numeric env var in
resolveLcmConfig()uses rawparseInt()/parseFloat():parseInt('oops', 10)returnsNaN, which is not nullish, so??never fires. This propagatesNaNthroughleafChunkTokens,bootstrapMaxTokens, and every derived value — effectively disabling all token budgets system-wide.Reproduction:
Fix
Oversized singleton: After the trim loop, check if the single kept message exceeds the budget. If so, return empty array instead of silently bypassing.
NaN propagation: Add
parseFiniteInt()andparseFiniteNumber()helpers that returnundefinedfor non-finite results. Replace all 16 rawparseInt/parseFloatcalls inresolveLcmConfig()with the safe helpers, so??fallback works correctly for invalid input.Testing