feat: compaction circuit breaker with auth failure detection#230
Merged
jalehman merged 4 commits intoApr 1, 2026
Merged
Conversation
…orms - Track consecutive auth failures in LcmContextEngine - Trip after configurable threshold (default 5) consecutive failures - Auto-reset after configurable cooldown (default 30 min) - Reset on successful compaction - Gate both compact() and compactLeafAsync() entry points - Propagate authFailure signal through CompactionResult and compactUntilUnder - Config: circuitBreakerThreshold, circuitBreakerCooldownMs (env + plugin config) - Fix: remove duplicate summaryModel/summaryProvider fields in LcmConfig type - Add 5 behavioral tests (trip, block, leaf-block, auto-reset, success-reset)
…Auth The false-positive circuit breaker fix (fcaccfc) only applied requireStructuralSignal to the primary success path in attemptSummarizerCall. The retry path in retryWithoutModelAuth still called extractProviderAuthFailure() without it, so valid summaries containing auth-error phrases (e.g. summaries of sessions discussing auth fixes) would false-trigger the circuit breaker on retry. Apply requireStructuralSignal: true to both the success and catch paths inside retryWithoutModelAuth (lines ~876 and ~896 in source). The primary catch path (line ~994 in dist) correctly remains unguarded since thrown exceptions carry structural signals (HTTP 401, error.kind). 460 tests pass. Audited all 4 call sites.
Scope compaction auth circuit-breaker state to the resolved summarizer identity instead of the shared engine instance, propagate auth failures out of full-sweep compaction even after partial progress, and add regression coverage plus a patch changeset. Regeneration-Prompt: | User asked to implement the review findings directly on PR Martian-Engineering#230 in the existing review worktree, then add a changeset and push back to the contributor branch. The two concrete bugs were: the compaction auth circuit breaker lived on the singleton LcmContextEngine even though summarizer/provider selection is resolved per request, and compactFullSweep hid provider auth failures when an earlier sweep pass had already made some progress. Preserve the feature and keep the change additive. Key the breaker by the effective summarizer identity used for compaction so one broken provider or auth profile does not block unrelated sessions, providers, or models. For custom injected summarize functions, keep breaker scope local to the session queue. Make full-sweep compaction surface authFailure even after partial compaction so callers do not treat it as a clean success and reset the breaker. Update engine result handling and add focused regression tests for provider scoping and the partial-progress full-sweep path. Add a patch changeset because this changes runtime compaction behavior.
Contributor
|
Thank you! |
This was referenced Apr 1, 2026
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a circuit breaker to prevent compaction retry storms when the LLM provider returns auth errors (e.g. expired API key, revoked token). Without this, a single auth failure can trigger an unbounded cascade of compaction retries that all fail identically.
Key changes
requireStructuralSignaloption onextractProviderAuthFailure()— only HTTP 401 status codes orerror.kind === "provider_auth"count as auth failures, preventing false positives from LLM summary text that happens to mention auth errorssummarize.tspatched — 3 userequireStructuralSignal: true(parsing LLM output where false positives are possible), 1 intentionally omits it (handling real thrown exceptions where structural signals are always present)Test plan
requireStructuralSignal: truerejects false positives from LLM text mentioning auth errorsrequireStructuralSignal) still trip the breaker🤖 Generated with Claude Code