This repository was archived by the owner on May 5, 2026. It is now read-only.

Commit 13e917e

fix: derive dynamic context-window guard thresholds
Derive context-window guard thresholds from the effective model window, keeping the 10% hard-min and 20% warning ratios with 4k/8k floors. Stop the embedded runner from forcing the old fixed guard overrides so runtime admission uses the dynamic resolver.

Validation:

- CI run 25151866833 passed, including build-artifacts and checks-node-channels.
- Parity gate 25151866868 passed.
- Testbox `pnpm test:channels` passed: 54 files / 433 tests.

Fixes openclaw#42999. Prepared head SHA: 9c80383
1 parent f072145 commit 13e917e
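The threshold derivation described in the commit message can be sketched as a small standalone snippet. The constant names mirror those in `src/agents/context-window-guard.ts`, but `guardThresholds` here is a simplified illustration of the ratio-plus-floor logic, not the repo's exact resolver:

```typescript
// Ratios and absolute floors from the commit: 10% hard-min / 20% warning,
// with 4k/8k floors for very small or unknown windows.
const HARD_MIN_TOKENS = 4_000;
const WARN_BELOW_TOKENS = 8_000;
const HARD_MIN_RATIO = 0.1;
const WARN_BELOW_RATIO = 0.2;

function guardThresholds(contextWindowTokens: number): {
  hardMinTokens: number;
  warnBelowTokens: number;
} {
  // Non-finite or non-positive windows fall back to the bare floors.
  const tokens =
    Number.isFinite(contextWindowTokens) && contextWindowTokens > 0
      ? Math.floor(contextWindowTokens)
      : 0;
  return {
    hardMinTokens: Math.max(HARD_MIN_TOKENS, Math.floor(tokens * HARD_MIN_RATIO)),
    warnBelowTokens: Math.max(WARN_BELOW_TOKENS, Math.floor(tokens * WARN_BELOW_RATIO)),
  };
}

// A 1M-token window derives 100k/200k thresholds; an 8k window stays
// on the 4k/8k floors, so small local models are no longer rejected.
console.log(guardThresholds(1_000_000)); // { hardMinTokens: 100000, warnBelowTokens: 200000 }
console.log(guardThresholds(8_000)); // { hardMinTokens: 4000, warnBelowTokens: 8000 }
```

This matches the behavior pinned down by the new `resolveContextWindowGuardThresholds` tests in the diff below: percentage-based thresholds for large windows, floors for tiny or invalid ones.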

8 files changed: 147 additions & 41 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -43,6 +43,7 @@ Docs: https://docs.openclaw.ai
 - Security/outbound: strip re-formed HTML tags during plain-text sanitization so nested tag fragments cannot leave a CodeQL-detected `<script>` sequence behind. Thanks @vincentkoc.
 - Security/secrets: compare credential bytes with padded timing-safe buffers instead of hashing candidate passwords before equality checks. Thanks @vincentkoc.
 - CLI/agents/status: keep `openclaw agents`, text `agents list`, and plain text `status` on read-only metadata paths so human output no longer preloads plugin runtimes or live channel scans before printing. Fixes #74195. Thanks @NianJiuZst.
+- Agents/local models: derive context-window guard thresholds from the effective model window with 4k/8k safety floors, so small local models are no longer rejected by fixed 16k/32k preflight cutoffs. Fixes #42999. Thanks @chengjialu8888.
 - Media: treat legacy Word/OLE attachments with `application/msword` or `application/x-cfb` MIME as binary so printable-looking `.doc` files are not embedded into prompts as text. Fixes #54176; carries forward #54380. Thanks @andyliu.
 - Config: accept documented `browser.tabCleanup` keys in strict root config validation, so configured tab cleanup no longer fails before runtime reads it. Fixes #74577. Thanks @lonexreb and @ezdlp.
 - Cron: validate disabled job schedule edits before persisting updates, so invalid cron changes no longer partially mutate stored jobs. Fixes #74459. Thanks @yfge.
```

docs/gateway/local-models.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -319,7 +319,7 @@ Compatibility notes for stricter OpenAI-compatible backends:
   OpenClaw process RSS/heap snapshot in diagnostics. For LM Studio/Ollama
   memory pressure, match that timestamp against the server log or macOS crash /
   jetsam log to confirm whether the model server was killed.
-- OpenClaw warns when the detected context window is below **32k** and blocks below **16k**. If you hit that preflight, raise the server/model context limit or choose a larger model.
+- OpenClaw derives context-window preflight thresholds from the detected model window, or from the uncapped model window when `agents.defaults.contextTokens` lowers the effective window. It warns below 20% with an **8k** floor. Hard blocks use the 10% threshold with a **4k** floor, capped to the effective context window so oversized model metadata cannot reject an otherwise valid user cap. If you hit that preflight, raise the server/model context limit or choose a larger model.
 - Context errors? Lower `contextWindow` or raise your server limit.
 - OpenAI-compatible server returns `messages[].content ... expected a string`?
   Add `compat.requiresStringContent: true` on that model entry.
```
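The cap interaction described in the updated doc line (hard-min derived from the uncapped reference window, then clamped to the effective window) can be illustrated with a hypothetical helper. `effectiveHardMin` and its inputs are illustrative only, not part of the repo's API:

```typescript
// Sketch of the hard-min clamp: the threshold derives from the uncapped
// reference window, but is capped to the effective window so inflated
// model metadata cannot hard-block an otherwise valid user cap.
// Assumes positive finite inputs for simplicity.
const HARD_MIN_FLOOR = 4_000;
const HARD_MIN_RATIO = 0.1;

function effectiveHardMin(effectiveTokens: number, referenceTokens: number): number {
  // 10% of the uncapped window, never below the absolute floor.
  const derived = Math.max(HARD_MIN_FLOOR, Math.floor(referenceTokens * HARD_MIN_RATIO));
  // Clamp: never demand more than the effective window itself
  // (while keeping at least the floor).
  return Math.min(derived, Math.max(effectiveTokens, HARD_MIN_FLOOR));
}

// agents.defaults.contextTokens caps a 1B-token metadata window to 20k:
// the derived 100M hard-min is clamped down to 20k, so the configured
// cap passes preflight instead of being rejected.
console.log(effectiveHardMin(20_000, 1_000_000_000)); // 20000
```

A 150k cap against a genuine 1M window still yields the derived 100k hard-min, matching the "derives guard thresholds from the reference window when capped" test in the diff below.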

src/agents/context-window-guard.test.ts

Lines changed: 86 additions & 18 deletions
```diff
@@ -6,6 +6,7 @@ import {
   evaluateContextWindowGuard,
   formatContextWindowBlockMessage,
   formatContextWindowWarningMessage,
+  resolveContextWindowGuardThresholds,
   resolveContextWindowInfo,
 } from "./context-window-guard.js";
 
@@ -35,41 +36,43 @@ describe("context-window-guard", () => {
     } satisfies OpenClawConfig;
   }
 
-  it("blocks below 16k (model metadata)", () => {
+  it("blocks below the hard-min floor (model metadata)", () => {
     const info = resolveContextWindowInfo({
       cfg: undefined,
       provider: "openrouter",
       modelId: "tiny",
-      modelContextWindow: 8000,
+      modelContextWindow: 3999,
       defaultTokens: 200_000,
     });
     const guard = evaluateContextWindowGuard({ info });
     expect(guard.source).toBe("model");
-    expect(guard.tokens).toBe(8000);
+    expect(guard.tokens).toBe(3999);
+    expect(guard.hardMinTokens).toBe(4000);
+    expect(guard.warnBelowTokens).toBe(8000);
     expect(guard.shouldWarn).toBe(true);
     expect(guard.shouldBlock).toBe(true);
   });
 
-  it("warns below 32k but does not block at 16k+", () => {
+  it("warns below the warning floor but does not block at hard-min+", () => {
     const info = resolveContextWindowInfo({
       cfg: undefined,
       provider: "openai",
       modelId: "small",
-      modelContextWindow: 24_000,
+      modelContextWindow: 6_000,
       defaultTokens: 200_000,
     });
     const guard = evaluateContextWindowGuard({ info });
-    expect(guard.tokens).toBe(24_000);
+    expect(guard.tokens).toBe(6_000);
     expect(guard.shouldWarn).toBe(true);
     expect(guard.shouldBlock).toBe(false);
   });
 
-  it("does not warn at 32k+ (model metadata)", () => {
+  it("does not warn at the warning floor (model metadata)", () => {
     const info = resolveContextWindowInfo({
       cfg: undefined,
       provider: "openai",
       modelId: "ok",
-      modelContextWindow: 64_000,
+      modelContextWindow: 8_000,
       defaultTokens: 200_000,
     });
     const guard = evaluateContextWindowGuard({ info });
@@ -78,7 +81,7 @@ describe("context-window-guard", () => {
   });
 
   it("uses models.providers.*.models[].contextWindow when present", () => {
-    const cfg = openRouterModelConfig({ contextWindow: 12_000 });
+    const cfg = openRouterModelConfig({ contextWindow: 3_000 });
 
     const info = resolveContextWindowInfo({
       cfg,
@@ -160,6 +163,10 @@ describe("context-window-guard", () => {
     });
     const guard = evaluateContextWindowGuard({ info });
     expect(info.source).toBe("agentContextTokens");
+    expect(info.tokens).toBe(20_000);
+    expect(info.referenceTokens).toBe(200_000);
+    expect(guard.hardMinTokens).toBe(20_000);
+    expect(guard.warnBelowTokens).toBe(40_000);
     expect(guard.shouldWarn).toBe(true);
     expect(guard.shouldBlock).toBe(false);
   });
@@ -193,25 +200,86 @@ describe("context-window-guard", () => {
     expect(guard.shouldBlock).toBe(false);
   });
 
+  it("normalizes invalid default context tokens to the warning floor", () => {
+    const info = resolveContextWindowInfo({
+      cfg: undefined,
+      provider: "anthropic",
+      modelId: "unknown",
+      defaultTokens: Number.NaN,
+    });
+    const guard = evaluateContextWindowGuard({ info });
+    expect(info).toEqual({ source: "default", tokens: 8_000 });
+    expect(guard.shouldWarn).toBe(false);
+    expect(guard.shouldBlock).toBe(false);
+  });
+
+  it("blocks invalid guard token counts instead of silently passing", () => {
+    const guard = evaluateContextWindowGuard({
+      info: { tokens: Number.NaN, source: "model" },
+    });
+    expect(guard.tokens).toBe(0);
+    expect(guard.hardMinTokens).toBe(4_000);
+    expect(guard.warnBelowTokens).toBe(8_000);
+    expect(guard.shouldWarn).toBe(true);
+    expect(guard.shouldBlock).toBe(true);
+  });
+
   it("allows overriding thresholds", () => {
     const info = { tokens: 10_000, source: "model" as const };
     const guard = evaluateContextWindowGuard({
       info,
       warnBelowTokens: 12_000,
       hardMinTokens: 9_000,
     });
+    expect(guard.hardMinTokens).toBe(9_000);
+    expect(guard.warnBelowTokens).toBe(12_000);
     expect(guard.shouldWarn).toBe(true);
     expect(guard.shouldBlock).toBe(false);
   });
 
-  it("exports thresholds as expected", () => {
-    expect(CONTEXT_WINDOW_HARD_MIN_TOKENS).toBe(16_000);
-    expect(CONTEXT_WINDOW_WARN_BELOW_TOKENS).toBe(32_000);
+  it("exports threshold floors as expected", () => {
+    expect(CONTEXT_WINDOW_HARD_MIN_TOKENS).toBe(4_000);
+    expect(CONTEXT_WINDOW_WARN_BELOW_TOKENS).toBe(8_000);
+  });
+
+  it("derives percentage-based thresholds above the safe floors", () => {
+    expect(resolveContextWindowGuardThresholds(1_000_000)).toEqual({
+      hardMinTokens: 100_000,
+      warnBelowTokens: 200_000,
+    });
+    expect(resolveContextWindowGuardThresholds(64_000)).toEqual({
+      hardMinTokens: 6_400,
+      warnBelowTokens: 12_800,
+    });
+    expect(resolveContextWindowGuardThresholds(Number.NaN)).toEqual({
+      hardMinTokens: 4_000,
+      warnBelowTokens: 8_000,
+    });
+  });
+
+  it("derives guard thresholds from the reference window when capped", () => {
+    const guard = evaluateContextWindowGuard({
+      info: { tokens: 150_000, referenceTokens: 1_000_000, source: "agentContextTokens" },
+    });
+    expect(guard.hardMinTokens).toBe(100_000);
+    expect(guard.warnBelowTokens).toBe(200_000);
+    expect(guard.shouldWarn).toBe(true);
+    expect(guard.shouldBlock).toBe(false);
+  });
+
+  it("does not let inflated reference metadata hard-block a valid effective cap", () => {
+    const guard = evaluateContextWindowGuard({
+      info: { tokens: 20_000, referenceTokens: 1_000_000_000, source: "agentContextTokens" },
+    });
+    expect(guard.hardMinTokens).toBe(20_000);
+    expect(guard.warnBelowTokens).toBe(200_000_000);
+    expect(guard.shouldWarn).toBe(true);
+    expect(guard.shouldBlock).toBe(false);
   });
 
   it("adds a local-model hint to warning messages for localhost endpoints", () => {
     const guard = evaluateContextWindowGuard({
-      info: { tokens: 24_000, source: "model" },
+      info: { tokens: 6_000, source: "model" },
     });
 
     expect(
@@ -221,12 +289,12 @@ describe("context-window-guard", () => {
         guard,
         runtimeBaseUrl: "http://127.0.0.1:1234/v1",
       }),
-    ).toContain("local/self-hosted runs work best at 32000+ tokens");
+    ).toContain("local/self-hosted runs work best at 8000+ tokens");
   });
 
   it("does not add local-model hints for generic custom endpoints", () => {
     const guard = evaluateContextWindowGuard({
-      info: { tokens: 24_000, source: "model" },
+      info: { tokens: 6_000, source: "model" },
     });
 
     expect(
@@ -236,7 +304,7 @@ describe("context-window-guard", () => {
         guard,
         runtimeBaseUrl: "https://models.example.com/v1",
       }),
-    ).toBe("low context window: custom/hosted-proxy-model ctx=24000 (warn<32000) source=model");
+    ).toBe("low context window: custom/hosted-proxy-model ctx=6000 (warn<8000) source=model");
   });
 
   it("adds a local-model hint to block messages for localhost endpoints", () => {
@@ -281,14 +349,14 @@ describe("context-window-guard", () => {
 
   it("keeps block messages concise for public providers", () => {
     const guard = evaluateContextWindowGuard({
-      info: { tokens: 8_000, source: "model" },
+      info: { tokens: 3_000, source: "model" },
     });
 
     expect(
       formatContextWindowBlockMessage({
         guard,
         runtimeBaseUrl: "https://api.openai.com/v1",
       }),
-    ).toBe(`Model context window too small (8000 tokens; source=model). Minimum is 16000.`);
+    ).toBe(`Model context window too small (3000 tokens; source=model). Minimum is 4000.`);
   });
 });
```

src/agents/context-window-guard.ts

Lines changed: 50 additions & 13 deletions
```diff
@@ -2,13 +2,16 @@ import type { OpenClawConfig } from "../config/types.openclaw.js";
 import { resolveProviderEndpoint } from "./provider-attribution.js";
 import { findNormalizedProviderValue } from "./provider-id.js";
 
-export const CONTEXT_WINDOW_HARD_MIN_TOKENS = 16_000;
-export const CONTEXT_WINDOW_WARN_BELOW_TOKENS = 32_000;
+export const CONTEXT_WINDOW_HARD_MIN_TOKENS = 4_000;
+export const CONTEXT_WINDOW_WARN_BELOW_TOKENS = 8_000;
+export const CONTEXT_WINDOW_HARD_MIN_RATIO = 0.1;
+export const CONTEXT_WINDOW_WARN_BELOW_RATIO = 0.2;
 
 export type ContextWindowSource = "model" | "modelsConfig" | "agentContextTokens" | "default";
 
 export type ContextWindowInfo = {
   tokens: number;
+  referenceTokens?: number;
   source: ContextWindowSource;
 };
 
@@ -43,25 +46,34 @@ export function resolveContextWindowInfo(params: {
   const fromModel =
     normalizePositiveInt(params.modelContextTokens) ??
     normalizePositiveInt(params.modelContextWindow);
+  const defaultTokens =
+    normalizePositiveInt(params.defaultTokens) ?? CONTEXT_WINDOW_WARN_BELOW_TOKENS;
   const baseInfo = fromModelsConfig
     ? { tokens: fromModelsConfig, source: "modelsConfig" as const }
     : fromModel
       ? { tokens: fromModel, source: "model" as const }
-      : { tokens: Math.floor(params.defaultTokens), source: "default" as const };
+      : { tokens: defaultTokens, source: "default" as const };
 
   const capTokens = normalizePositiveInt(params.cfg?.agents?.defaults?.contextTokens);
   if (capTokens && capTokens < baseInfo.tokens) {
-    return { tokens: capTokens, source: "agentContextTokens" };
+    return { tokens: capTokens, referenceTokens: baseInfo.tokens, source: "agentContextTokens" };
   }
 
   return baseInfo;
 }
 
 export type ContextWindowGuardResult = ContextWindowInfo & {
+  hardMinTokens: number;
+  warnBelowTokens: number;
   shouldWarn: boolean;
   shouldBlock: boolean;
 };
 
+export type ContextWindowGuardThresholds = {
+  hardMinTokens: number;
+  warnBelowTokens: number;
+};
+
 export type ContextWindowGuardHint = {
   endpointClass: ReturnType<typeof resolveProviderEndpoint>["endpointClass"];
   likelySelfHosted: boolean;
@@ -77,13 +89,29 @@ export function resolveContextWindowGuardHint(params: {
   };
 }
 
+export function resolveContextWindowGuardThresholds(
+  contextWindowTokens: number,
+): ContextWindowGuardThresholds {
+  const tokens = normalizePositiveInt(contextWindowTokens) ?? 0;
+  return {
+    hardMinTokens: Math.max(
+      CONTEXT_WINDOW_HARD_MIN_TOKENS,
+      Math.floor(tokens * CONTEXT_WINDOW_HARD_MIN_RATIO),
+    ),
+    warnBelowTokens: Math.max(
+      CONTEXT_WINDOW_WARN_BELOW_TOKENS,
+      Math.floor(tokens * CONTEXT_WINDOW_WARN_BELOW_RATIO),
+    ),
+  };
+}
+
 export function formatContextWindowWarningMessage(params: {
   provider: string;
   modelId: string;
   guard: ContextWindowGuardResult;
   runtimeBaseUrl?: string | null;
 }): string {
-  const base = `low context window: ${params.provider}/${params.modelId} ctx=${params.guard.tokens} (warn<${CONTEXT_WINDOW_WARN_BELOW_TOKENS}) source=${params.guard.source}`;
+  const base = `low context window: ${params.provider}/${params.modelId} ctx=${params.guard.tokens} (warn<${params.guard.warnBelowTokens}) source=${params.guard.source}`;
   const hint = resolveContextWindowGuardHint({ runtimeBaseUrl: params.runtimeBaseUrl });
   if (!hint.likelySelfHosted) {
     return base;
@@ -102,7 +130,7 @@ export function formatContextWindowWarningMessage(params: {
   }
   return (
     `${base}; local/self-hosted runs work best at ` +
-    `${CONTEXT_WINDOW_WARN_BELOW_TOKENS}+ tokens and may show weaker tool use or more compaction until the server/model context limit is raised`
+    `${params.guard.warnBelowTokens}+ tokens and may show weaker tool use or more compaction until the server/model context limit is raised`
   );
 }
 
@@ -112,7 +140,7 @@ export function formatContextWindowBlockMessage(params: {
 }): string {
   const base =
     `Model context window too small (${params.guard.tokens} tokens; ` +
-    `source=${params.guard.source}). Minimum is ${CONTEXT_WINDOW_HARD_MIN_TOKENS}.`;
+    `source=${params.guard.source}). Minimum is ${params.guard.hardMinTokens}.`;
   const hint = resolveContextWindowGuardHint({ runtimeBaseUrl: params.runtimeBaseUrl });
   if (!hint.likelySelfHosted) {
     return base;
@@ -129,7 +157,7 @@ export function formatContextWindowBlockMessage(params: {
   return (
     `${base} This looks like a local model endpoint. ` +
     `Raise the server/model context limit or choose a larger model. ` +
-    `OpenClaw local/self-hosted runs work best at ${CONTEXT_WINDOW_WARN_BELOW_TOKENS}+ tokens.`
+    `OpenClaw local/self-hosted runs work best at ${params.guard.warnBelowTokens}+ tokens.`
   );
 }
 
@@ -138,16 +166,25 @@ export function evaluateContextWindowGuard(params: {
   warnBelowTokens?: number;
   hardMinTokens?: number;
 }): ContextWindowGuardResult {
+  const normalizedTokens = normalizePositiveInt(params.info.tokens);
+  const tokens = normalizedTokens ?? 0;
+  const referenceTokens = normalizePositiveInt(params.info.referenceTokens) ?? tokens;
+  const resolvedThresholds = resolveContextWindowGuardThresholds(referenceTokens);
   const warnBelow = Math.max(
     1,
-    Math.floor(params.warnBelowTokens ?? CONTEXT_WINDOW_WARN_BELOW_TOKENS),
+    Math.floor(params.warnBelowTokens ?? resolvedThresholds.warnBelowTokens),
+  );
+  const defaultHardMin = Math.min(
+    resolvedThresholds.hardMinTokens,
+    Math.max(tokens, CONTEXT_WINDOW_HARD_MIN_TOKENS),
   );
-  const hardMin = Math.max(1, Math.floor(params.hardMinTokens ?? CONTEXT_WINDOW_HARD_MIN_TOKENS));
-  const tokens = Math.max(0, Math.floor(params.info.tokens));
+  const hardMin = Math.max(1, Math.floor(params.hardMinTokens ?? defaultHardMin));
   return {
     ...params.info,
     tokens,
-    shouldWarn: tokens > 0 && tokens < warnBelow,
-    shouldBlock: tokens > 0 && tokens < hardMin,
+    hardMinTokens: hardMin,
+    warnBelowTokens: warnBelow,
+    shouldWarn: !normalizedTokens || tokens < warnBelow,
+    shouldBlock: !normalizedTokens || tokens < hardMin,
   };
 }
```

src/agents/pi-embedded-runner/run.overflow-compaction.harness.ts

Lines changed: 4 additions & 0 deletions
```diff
@@ -195,6 +195,8 @@ export const mockedEvaluateContextWindowGuard = vi.fn(() => ({
   shouldBlock: false,
   tokens: 200000,
   source: "model",
+  hardMinTokens: 1000,
+  warnBelowTokens: 5000,
 }));
 export const mockedResolveContextWindowInfo = vi.fn(() => ({
   tokens: 200000,
@@ -357,6 +359,8 @@ export function resetRunOverflowCompactionHarnessMocks(): void {
     shouldBlock: false,
     tokens: 200000,
     source: "model",
+    hardMinTokens: 1000,
+    warnBelowTokens: 5000,
   });
   mockedResolveContextWindowInfo.mockReset();
   mockedResolveContextWindowInfo.mockReturnValue({
```
