fix: harden bootstrap budget against oversized messages and NaN config by 100yenadmin · Pull Request #258 · Martian-Engineering/lossless-claw

100yenadmin · 2026-04-03T21:41:51Z

Problem

Two bugs in the bootstrap budget cap introduced in #255:

1. Oversized single tail message bypasses the budget

The trim loop condition if (kept.length > 0 && totalTokens + tokenCount > safeMaxTokens) means the first message processed (the newest, at the tail) is always kept regardless of its token count.

Reproduction:

// Single 5000-char message (~1250 tokens) with 100-token budget
trimBootstrapMessagesToBudget([{ role: 'assistant', content: 'x'.repeat(5000) }], 100)
// Before fix: returns [message]  (1250 tokens, 12.5x over budget)
// After fix:  returns []          (budget respected)

This matters in practice because tool results and large assistant responses can easily be 50K+ tokens as the tail message.

2. NaN propagates through all numeric env config parsing

Every numeric env var in resolveLcmConfig() uses raw parseInt() / parseFloat():

(env.LCM_LEAF_CHUNK_TOKENS !== undefined ? parseInt(env.LCM_LEAF_CHUNK_TOKENS, 10) : undefined)
  ?? toNumber(pc.leafChunkTokens) ?? 20000

parseInt('oops', 10) returns NaN, which is not nullish, so ?? never fires. This propagates NaN through leafChunkTokens, bootstrapMaxTokens, and every derived value — effectively disabling all token budgets system-wide.

Reproduction:

const config = resolveLcmConfig({ LCM_LEAF_CHUNK_TOKENS: 'oops' }, {});
// Before fix: { leafChunkTokens: NaN, bootstrapMaxTokens: NaN }
// After fix:  { leafChunkTokens: 20000, bootstrapMaxTokens: 6000 }

Fix

Oversized singleton: After the trim loop, check if the single kept message exceeds the budget. If so, return empty array instead of silently bypassing.
NaN propagation: Add parseFiniteInt() and parseFiniteNumber() helpers that return undefined for non-finite results. Replace all 16 raw parseInt/parseFloat calls in resolveLcmConfig() with the safe helpers, so ?? fallback works correctly for invalid input.

Testing

Both syntax checks pass
Both bugs reproduced and verified fixed with minimal scripts
Adversarial review by independent agent confirmed both issues

Two bugs in the bootstrap budget cap introduced in Martian-Engineering#255: 1. A single oversized tail message bypasses the budget entirely. The trim loop condition 'if (kept.length > 0 && ...)' means the first message (newest) is always kept regardless of size. A 50K-token tool result as the last message will bypass a 6K budget. Fix: after the loop, check if the single kept message exceeds budget and return empty instead of silently bypassing. 2. NaN propagates through all numeric env config parsing. parseInt('oops', 10) returns NaN, which is not nullish, so ?? fallback never fires. Invalid env like LCM_LEAF_CHUNK_TOKENS=oops propagates NaN through leafChunkTokens, bootstrapMaxTokens, and every derived config value — effectively disabling all token budgets. Fix: add parseFiniteInt/parseFiniteNumber helpers that return undefined for non-finite results. Replace all 16 raw parseInt/parseFloat calls in resolveLcmConfig() with the safe helpers. Both bugs were found and reproduced with minimal scripts during adversarial review of a production incident.

Copilot

Pull request overview

Hardens LCM bootstrap token budgeting and configuration parsing to prevent budget bypasses from oversized tail messages and to stop invalid numeric env vars from propagating NaN through token-budget computations.

Changes:

Prevents a single oversized newest bootstrap message from being kept when it exceeds the bootstrap token budget.
Introduces parseFiniteInt() / parseFiniteNumber() and uses them across resolveLcmConfig() so invalid env numeric values fall back correctly via ??.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
`src/engine.ts`	Adds a post-trim guard to drop an oversized singleton tail message instead of bypassing the bootstrap budget.
`src/db/config.ts`	Adds safe env parsing helpers and replaces raw `parseInt`/`parseFloat` usage to avoid `NaN` propagation in resolved config.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-03T21:45:22Z

+  // If a single oversized tail message exceeds the budget, return empty
+  // rather than silently bypassing the budget cap. An empty bootstrap is
+  // safer than an exploding one.
+  if (kept.length === 1 && totalTokens > safeMaxTokens) {
+    return [];
+  }


Returning [] here can also mean “messages exist but newest message exceeds the bootstrap budget”. The first-time bootstrap caller currently treats bootstrapMessages.length === 0 as “no leaf-path messages in session” and persists state accordingly, which becomes a misleading result/message after this change. Consider signaling this case explicitly (e.g., return a distinct status/reason from trimBootstrapMessagesToBudget, or have the caller branch on historicalMessages.length > 0 to return a more accurate reason like “tail message exceeds bootstrapMaxTokens”).

Copilot · 2026-04-03T21:45:23Z

+  // If a single oversized tail message exceeds the budget, return empty
+  // rather than silently bypassing the budget cap. An empty bootstrap is
+  // safer than an exploding one.
+  if (kept.length === 1 && totalTokens > safeMaxTokens) {
+    return [];


This change makes trimBootstrapMessagesToBudget() potentially return an empty array even when messages.length > 0, which contradicts the function’s doc comment (“The newest message is always preserved…”). Please update the documentation/contract to reflect the new oversized-tail behavior so callers don’t assume non-empty output when input is non-empty.

Copilot · 2026-04-03T21:45:23Z

+  // If a single oversized tail message exceeds the budget, return empty
+  // rather than silently bypassing the budget cap. An empty bootstrap is
+  // safer than an exploding one.
+  if (kept.length === 1 && totalTokens > safeMaxTokens) {
+    return [];
+  }


There’s existing bootstrap token-cap coverage in test/engine.test.ts, but this new edge case (single newest message exceeds budget => empty bootstrap) isn’t covered. Adding a regression test for an oversized tail message would help prevent reintroducing the bypass (and validate the expected behavior when bootstrapMessages becomes empty).

Copilot · 2026-04-03T21:45:23Z

+/** Safely parse a finite integer from an environment string, or return undefined.
+ *  Unlike raw parseInt(), this returns undefined for NaN so ?? fallback works. */
+function parseFiniteInt(value: string | undefined): number | undefined {
+  if (value === undefined) return undefined;
+  const parsed = parseInt(value, 10);
+  return Number.isFinite(parsed) ? parsed : undefined;
+}
+
+/** Safely parse a finite float from an environment string, or return undefined. */
+function parseFiniteNumber(value: string | undefined): number | undefined {
+  if (value === undefined) return undefined;
+  const parsed = parseFloat(value);
+  return Number.isFinite(parsed) ? parsed : undefined;
+}


The NaN-hardening here looks correct, but there’s no automated coverage ensuring invalid numeric env vars (e.g. LCM_LEAF_CHUNK_TOKENS='oops', LCM_BOOTSTRAP_MAX_TOKENS='oops') fall back to plugin/default values instead of producing NaN. Please add tests in test/config.test.ts for at least one int and one float env var to lock in the new parseFiniteInt/parseFiniteNumber behavior.

Add focused regression tests for the oversized singleton bootstrap tail case and invalid numeric env parsing fallback behavior. Add a patch changeset because this PR changes runtime behavior and should be reflected in release notes. Regeneration-Prompt: | The open PR fixed two production regressions but still lacked the release and test follow-through needed to merge. Add targeted regression coverage instead of broad refactors: one config test that proves invalid numeric env values like LCM_LEAF_CHUNK_TOKENS=oops fall back through plugin/default resolution, and one bootstrap test that proves a single oversized tail message is dropped instead of bypassing bootstrapMaxTokens. Also add a patch changeset because the PR changes runtime behavior visible to users and maintainers expect release notes coverage for that.

jalehman · 2026-04-03T22:04:13Z

Thank you!

The previous docstring claimed 'The newest message is always preserved', which is no longer true after the oversized-singleton guard. Updated to document the new behavior: returns empty when only one message exists and it exceeds the budget. Addresses review feedback on Martian-Engineering#258.

100yenadmin · 2026-04-03T22:06:33Z

Thanks for the review feedback — addressed the docstring inconsistency (pushed).

On the test coverage requests (oversized-tail regression test + NaN env config fallback test): both are valid and I'd be happy to add them, but I don't have test infrastructure wired up in the fork. If you'd prefer I add the test cases in a follow-up commit, could you point me at the right test runner setup? Otherwise I can describe the test cases and let a maintainer add them:

Oversized tail test (engine.test.ts):

// Single message exceeding bootstrapMaxTokens should return empty
const result = trimBootstrapMessagesToBudget(
  [{ role: 'assistant', content: 'x'.repeat(5000) }], // ~1250 tokens
  100
);
expect(result).toEqual([]);

NaN config fallback test (config.test.ts):

// Invalid env should fall back to defaults, not NaN
const config = resolveLcmConfig({ LCM_LEAF_CHUNK_TOKENS: 'oops' }, {});
expect(config.leafChunkTokens).toBe(20000);
expect(Number.isFinite(config.bootstrapMaxTokens)).toBe(true);

const config2 = resolveLcmConfig({ LCM_CONTEXT_THRESHOLD: 'invalid' }, {});
expect(config2.contextThreshold).toBe(0.75);

Copilot AI review requested due to automatic review settings April 3, 2026 21:41

Copilot started reviewing on behalf of 100yenadmin April 3, 2026 21:42 View session

Copilot AI reviewed Apr 3, 2026

View reviewed changes

jalehman merged commit cd18739 into Martian-Engineering:main Apr 3, 2026
2 checks passed

github-actions Bot mentioned this pull request Apr 3, 2026

chore: version packages #231

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: harden bootstrap budget against oversized messages and NaN config#258

fix: harden bootstrap budget against oversized messages and NaN config#258
jalehman merged 2 commits into
Martian-Engineering:mainfrom
electricsheephq:fix/bootstrap-budget-nan-and-oversized

100yenadmin commented Apr 3, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Copilot AI Apr 3, 2026

Uh oh!

Uh oh!

jalehman commented Apr 3, 2026

Uh oh!

100yenadmin commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

100yenadmin commented Apr 3, 2026

Problem

1. Oversized single tail message bypasses the budget

2. NaN propagates through all numeric env config parsing

Fix

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

jalehman commented Apr 3, 2026

Uh oh!

100yenadmin commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants