fix(agents): reevaluate preflight compaction on fresh totals#65622
fix(agents): reevaluate preflight compaction on fresh totals#65622neeravmakwana wants to merge 2 commits into
Conversation
🔒 Aisle Security AnalysisWe found 2 potential security issue(s) in this PR:
1. 🟠 Denial of Service via full transcript read/parse during preflight compaction token estimation
Description
Impact:
Vulnerable code (fallback trigger): const transcriptPromptTokens = hasUsableFreshPersistedTokens
? undefined
: estimatePromptTokensFromSessionTranscript({
sessionId: entry.sessionId,
storePath: params.storePath,
sessionFile: entry.sessionFile ?? params.followupRun.run.sessionFile,
});Data-flow context:
Even though the heavy read/parse occurs in RecommendationAvoid full transcript reads/parses on the request path. Recommended mitigations (combine as appropriate):
Example (size cap + fast-path): import fs from "node:fs";
function safeEstimateFromTranscript(logPath: string, maxBytes = 5_000_000) {
const stat = fs.statSync(logPath);
if (stat.size > maxBytes) return undefined; // or force compaction
// ...stream parse instead of readFileSync
}Also ensure cached 2. 🟠 Arbitrary file read via unvalidated `sessionFile` when estimating tokens from session transcript
Description
This can enable arbitrary file read / information disclosure (e.g., reading Vulnerable call site: const transcriptPromptTokens = hasUsableFreshPersistedTokens
? undefined
: estimatePromptTokensFromSessionTranscript({
sessionId: entry.sessionId,
storePath: params.storePath,
sessionFile: entry.sessionFile ?? params.followupRun.run.sessionFile,
});Path resolution issue when // resolveSessionTranscriptCandidates(...)
const trimmed = sessionFile.trim();
if (trimmed) {
candidates.push(path.resolve(trimmed));
}RecommendationConstrain transcript reads to an allowed directory and reject/ignore arbitrary Recommended defense-in-depth:
Example fix (approach: require const safeSessionFile = params.storePath
? resolveSessionFilePath(entry.sessionId, { sessionFile: entry.sessionFile ?? params.followupRun.run.sessionFile }, { sessionsDir: path.dirname(params.storePath) })
: undefined;
const transcriptPromptTokens = hasUsableFreshPersistedTokens
? undefined
: estimatePromptTokensFromSessionTranscript({
sessionId: entry.sessionId,
storePath: params.storePath,
sessionFile: safeSessionFile,
});And in // Do not accept arbitrary path.resolve(sessionFile) when storePath is missing.Analyzed PR: #65622 at commit Last updated on: 2026-04-13T01:17:47Z |
Greptile SummaryThis PR fixes a long-standing silent gap in preflight compaction: the previous code returned early whenever
Confidence Score: 4/5Safe to merge after fixing the mock — the runtime fix is correct but the regression test will fail as written. The core implementation change is correct and well-reasoned. One P1 issue: the new test's mock doesn't apply tokensAfter to update totalTokens, causing an assertion failure that blocks CI. Fix the mock and the PR is ready. src/auto-reply/reply/agent-runner-memory.test.ts — the incrementCompactionCountMock needs a tokensAfter branch
|
|
Follow-up on the AI review comments:
Verification rerun after the follow-up changes:
Pushed in |
|
Related work from PRtags group Title: Preflight compaction skips fresh token totals
|
Summary
totalTokensvalue, so later turns never re-checked the threshold after the first successful reply.runPreflightCompactionIfNeeded()and added a regression test that proves fresh cached totals can still trigger preflight compaction.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
runPreflightCompactionIfNeeded()treated fresh persisted totals as a reason to skip the preflight check entirely, even thoughSessionEntry.totalTokensis the cached context snapshot that preflight should use for its threshold gate.shouldRunPreflightCompaction(), but no regression test covering the outer runtime path where fresh cached totals short-circuited before the helper could run.Regression Test Plan (if applicable)
src/auto-reply/reply/agent-runner-memory.test.tstotalTokensFresh: trueand a cached total already above threshold still runs preflight compaction on the next turn.runPreflightCompactionIfNeeded()itself rather than only the pure threshold helper.src/auto-reply/reply/reply-state.test.tsalready covers the helper-level threshold behavior.User-visible / Behavior Changes
Preflight safeguard compaction can trigger again on later turns when a session's fresh cached context total is already over the configured threshold, instead of only working before the first successful reply.
Diagram (if applicable)
Security Impact (required)
Yes/No) NoYes/No) NoYes/No) NoYes/No) NoYes/No) NoYes, explain risk + mitigation:Repro + Verification
Environment
anthropic/claude-opus-4-6andopenai/gpt-5.4in focused testsagents.defaults.compaction.mode: safeguard,agentCfgContextTokens: 100000Steps
totalTokenscontext snapshot after a successful reply.runPreflightCompactionIfNeeded()evaluates the threshold or returns early.Expected
Actual
totalTokensFreshwas already true.Evidence
Human Verification (required)
pnpm test src/auto-reply/reply/agent-runner-memory.test.ts; ranpnpm test src/auto-reply/reply/reply-state.test.ts.reply-state.test.ts; kept post-compaction token persistence behavior intact in the new runtime test.pnpm buildis currently failing in an unrelated existingextensions/acpx/src/runtime.tsexport-resolution path (Could not resolve 'acpx/runtime');pnpm checkadvanced past the initial missing-madgeenvironment issue afterpnpm installbut then stalled duringcheck:madge-import-cycles;pnpm testprogressed through multiple shards and then stalled later invitest.extension-channelsoutside this touched area.Review Conversations
Compatibility / Migration
Yes/No) YesYes/No) NoYes/No) NoRisks and Mitigations
Additional Notes
Made with Cursor