fix paper link #1
Merged
Contributor
|
ty! |
billzhuang6569 pushed a commit to billzhuang6569/lossless-claw that referenced this pull request on Mar 21, 2026
fix paper link
100yenadmin referenced this pull request in 100yenadmin/lossless-claw on May 7, 2026
Opus subagent analysis of v4.1 baseline (333 blocks) vs v4.2 stubs (689 blocks) at the same 258K-token budget recommended four mitigations to address moderate-risk findings:

1. Recency cue [t-NNm] on turn headers
2. Semantic stub wrapping <lcm-stub> XML tags
3. Empty-assistant collapsing
4. Resolution markers at completion boundaries

Applied first-principles-architectural-decision skill (research, run-the-system, where-it-lives diagrams, adversarial debate) before building any of them.

Verdict: REJECT ALL FOUR. Each fails on a specific load-bearing constraint:

- #1 fails on prefix-cache stability (clock-based tag changes the rendered string on every assemble, invalidating the cache that v4.2's whole value proposition relies on). User timestamps already exist inline.
- #2 fails on "novelty has cost, format already works": the existing [LCM Tool Output: file_xxx | …] bracket form is correctly parsed by Opus in live tests (drilldown via lcm_describe works on Option F format). Replacing a working v4.1-trained format with a novel XML form is unjustified churn.
- #3 fails on Anthropic/OpenAI wire contract. The "empty assistants" contain tool_use blocks (required to live in assistant turns; paired with tool_results by toolCallId). Dropping them would break pairing, and providers reject orphan tool_results.
- #4 fails on detection signal. No reliable way to mark "work completed": user phrases like "go ahead" / "yes" / "keep digging" oscillate. False positives are strictly worse than no marker (license premature stubbing).

Adversarial debate at ≥95% confidence target on each. AGAINST won on all four. Decision record committed for future operators who hit similar moderate-risk findings and reach for similar mitigations.

Final v4.2 shipping shape: Options C + D + F at commit e309bed. Architecturally additive, reversible, default-off. Empirically: 333→689 items at same budget; Opus drills down correctly; no confabulation observed.
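The pairing constraint behind the rejection of mitigation #3 can be illustrated with a small sketch. The message and block shapes below are hypothetical simplifications (real provider payloads carry more fields and use provider-specific id names), and `findOrphanToolResults` and `dropTextlessAssistants` are illustrative helpers, not functions from lossless-claw.

```typescript
// Hypothetical, simplified transcript shapes for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; toolCallId: string }
  | { type: "tool_result"; toolCallId: string };

interface Message {
  role: "user" | "assistant";
  blocks: Block[];
}

// Returns the ids of tool_results that have no earlier tool_use in an assistant turn.
function findOrphanToolResults(messages: Message[]): string[] {
  const seenToolUse = new Set<string>();
  const orphans: string[] = [];
  for (const message of messages) {
    for (const block of message.blocks) {
      if (block.type === "tool_use" && message.role === "assistant") {
        seenToolUse.add(block.toolCallId);
      } else if (block.type === "tool_result" && !seenToolUse.has(block.toolCallId)) {
        orphans.push(block.toolCallId);
      }
    }
  }
  return orphans;
}

// A naive version of "empty-assistant collapsing": drop assistant turns with no visible text.
function dropTextlessAssistants(messages: Message[]): Message[] {
  return messages.filter(
    (m) =>
      m.role !== "assistant" ||
      m.blocks.some((b) => b.type === "text" && b.text.trim() !== ""),
  );
}

const transcript: Message[] = [
  { role: "user", blocks: [{ type: "text", text: "read the config" }] },
  { role: "assistant", blocks: [{ type: "tool_use", toolCallId: "call_1" }] },
  { role: "user", blocks: [{ type: "tool_result", toolCallId: "call_1" }] },
];
```

On this transcript the intact history has no orphans, while the collapsed history leaves the `call_1` result without its `tool_use`, which is the condition the commit message says providers reject.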
100yenadmin referenced this pull request in 100yenadmin/lossless-claw on May 7, 2026
Wire #1 of 3 for the agent context-management architecture (Wave-14).

# What this lands

Subscribes to the openclaw `llm_output` hook to maintain a per-session cache of (currentTokenCount, tokenBudget). Anchors via llm_output; extends via per-tool additive self-updates between LLM calls so parallel-tool-call sequences see accurate cumulative state, not stale ground-truth from the previous LLM call.

Replaces lcm_compact's previously-vapor `getRuntimeContext` callback with the real cache. Floor-check (currentRatio < reserveFraction) finally works against live data instead of always-undefined.

# Files

NEW:
- `src/plugin/token-state.ts` (~165 LOC): recordLlmOutput + accumulateToolResultTokens + getRuntimeContext + inferTokenBudget
- `test/v41-token-state.test.ts` (~110 LOC): 16 tests covering anchoring, accumulation, drift reset, session isolation, budget inference

EDITED:
- `src/plugin/index.ts`: `api.on("llm_output", ...)` handler that records tokens; lcm_compact registration passes `getRuntimeContext: () => getTokenStateRuntimeContext(ctx.sessionKey)`
- `src/tools/lcm-compact-tool.ts`: docstring on getRuntimeContext documents the wiring + tolerance for undefined fields

# How it works

Round 1 (anchor): LLM call N → llm_output fires → recordLlmOutput stores { currentTokenCount, tokenBudget, lastUpdateSource: "llm_output" } keyed by sessionKey

Round 2 (parallel-tool-call protection): Tools fire sequentially between LLM calls. Each tool's execute() ends with accumulateToolResultTokens(sessionKey, resultText) which adds Math.ceil(resultText.length / 4) to currentTokenCount. This way a 5-tool batch from one LLM response sees accurate cumulative state at each tool, not the same stale value.

Round 3 (drift reset): Next LLM call → llm_output snaps cache back to ground truth. Any per-tool estimation drift bounded by one iteration's batch.

# Why this layer (vs. waiting for openclaw SDK addition)

`OpenClawPluginToolContext` does not expose token state today. Wave-14 research confirmed (lossless-claw#472, openclaw#68930 closed NOT_PLANNED). The proper fix is an openclaw PR adding `getTokenState?: () => TokenSnapshot` to the factory context. That PR will be filed separately.

This module is the LCM-side bridge that makes the architecture work TODAY without openclaw changes. Once openclaw lands the official accessor, this hook handler becomes legacy / fallback for older versions and the per-tool accumulator stays as a within-iteration lag-protection layer.

# Verification

- 1573/1573 tests passing (1557 baseline + 16 new)
- 7/7 release-readiness preflight checks pass
- 330 TS errors (under 700 baseline; PR introduced none)
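The three rounds described under "How it works" can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/plugin/token-state.ts`: the function names come from the commit message, but the exact signatures, the `TokenState` shape, the `"tool_result"` source label, and the in-memory `Map` are assumptions.

```typescript
// Hypothetical state shape; only the three field names appear in the commit message.
interface TokenState {
  currentTokenCount: number;
  tokenBudget: number;
  lastUpdateSource: "llm_output" | "tool_result"; // "tool_result" is an assumed label
}

// Assumed storage: one entry per sessionKey, kept in memory.
const tokenStateBySession = new Map<string, TokenState>();

// Round 1 and Round 3: anchor (or re-anchor) the cache to ground truth from llm_output.
function recordLlmOutput(
  sessionKey: string,
  currentTokenCount: number,
  tokenBudget: number,
): void {
  tokenStateBySession.set(sessionKey, {
    currentTokenCount,
    tokenBudget,
    lastUpdateSource: "llm_output",
  });
}

// Round 2: extend the anchored count with a rough 4-characters-per-token estimate.
function accumulateToolResultTokens(sessionKey: string, resultText: string): void {
  const state = tokenStateBySession.get(sessionKey);
  if (!state) return; // no anchor yet, so there is nothing to extend
  state.currentTokenCount += Math.ceil(resultText.length / 4);
  state.lastUpdateSource = "tool_result";
}

// Callers are expected to tolerate undefined fields when no state exists yet.
function getRuntimeContext(sessionKey: string): Partial<TokenState> {
  const state = tokenStateBySession.get(sessionKey);
  return state ? { ...state } : {};
}
```

Under these assumptions, anchoring a session at 1000 tokens and then accumulating a 10-character and a 4-character tool result yields 1003 and then 1004; the next `recordLlmOutput` call overwrites the estimate, which is what bounds drift to one iteration's batch.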
https://voltropy.com/LCM isn't a real link: the paper actually lives at https://papers.voltropy.com/LCM