Commit 087dca8

fix(subagent): harden read-tool overflow guards and sticky reply threading (#19508)
* fix(gateway): avoid premature agent.wait completion on transient errors
* fix(agent): preemptively guard tool results against context overflow
* fix: harden tool-result context guard and add message_id metadata
* fix: use importOriginal in session-key mock to include DEFAULT_ACCOUNT_ID

  The run.skill-filter test was mocking ../../routing/session-key.js with only buildAgentMainSessionKey and normalizeAgentId, but the module also exports DEFAULT_ACCOUNT_ID, which is required transitively by src/web/auth-store.ts. Switch to the importOriginal pattern so all real exports are preserved alongside the mocked functions.
* pi-runner: guard accumulated tool-result overflow in transformContext
* PI runner: compact overflowing tool-result context
* Subagent: harden tool-result context recovery
* Enhance tool-result context handling by adding support for legacy tool outputs and improving character estimation for message truncation. This includes a new function to create legacy tool results and updates to existing functions to better manage context overflow scenarios.
* Enhance iMessage handling by adding reply tag support in send functions and tests. This includes modifications to prepend or rewrite reply tags based on the provided replyToId, ensuring proper message formatting for replies.
* Enhance message delivery across multiple channels by implementing sticky reply context for chunked messages. This includes preserving reply references in Discord, Telegram, and iMessage, ensuring that follow-up messages maintain their intended reply targets. Additionally, improve handling of reply tags in system prompts and tests to support consistent reply behavior.
* Enhance read tool functionality by implementing auto-paging across chunks when no explicit limit is provided, scaling the output budget based on the model context window. Additionally, add tests for adaptive reading behavior and capped continuation guidance for large outputs. Update related functions to support these features.
* Refine tool-result context management by stripping oversized read-tool details payloads during compaction, ensuring repeated read calls do not bypass context limits. Introduce new utility functions for handling truncation content and enhance character estimation for tool results. Add tests to validate the removal of excessive details in context overflow scenarios.
* Refine message delivery logic in Matrix and Telegram by introducing a flag to track whether a text chunk was sent. This ensures that replies are only marked as delivered when a text chunk has been successfully sent, improving the accuracy of reply handling in both channels.
* fix: tighten reply threading coverage and prep fixes (#19508) (thanks @tyler6204)
1 parent 75e11fe commit 087dca8

40 files changed

Lines changed: 2109 additions & 217 deletions

CHANGELOG.md

Lines changed: 9 additions & 0 deletions
@@ -8,6 +8,9 @@ Docs: https://docs.openclaw.ai
 
 - Agents/Subagents: add an accepted response note for `sessions_spawn` explaining polling subagents are disabled for one-off calls. Thanks @tyler6204.
 - Agents/Subagents: prefix spawned subagent task messages with context to preserve source information in downstream handling. Thanks @tyler6204.
+- iMessage: support `replyToId` on outbound text/media sends and normalize leading `[[reply_to:<id>]]` tags so replies target the intended iMessage. Thanks @tyler6204.
+- UI/Sessions: avoid duplicating typed session prefixes in display names (for example `Subagent Subagent ...`). Thanks @tyler6204.
+- Auto-reply/Prompts: include trusted inbound `message_id` in conversation metadata payloads for downstream targeting workflows. Thanks @tyler6204.
 - iOS/Talk: add a `Background Listening` toggle that keeps Talk Mode active while the app is backgrounded (off by default for battery safety). Thanks @zeulewan.
 - iOS/Talk: harden barge-in behavior by disabling interrupt-on-speech when output route is built-in speaker/receiver, reducing false interruptions from local TTS bleed-through. Thanks @zeulewan.
 - iOS/Talk: add a `Voice Directive Hint` toggle for Talk Mode prompts so users can disable ElevenLabs voice-switching instructions to save tokens when not needed. (#18250) Thanks @zeulewan.
@@ -42,6 +45,12 @@ Docs: https://docs.openclaw.ai
 
 ### Fixes
 
+- Agents/Subagents: preemptively guard accumulated tool-result context before model calls by truncating oversized outputs and compacting oldest tool-result messages to avoid context-window overflow crashes. Thanks @tyler6204.
+- Agents/Subagents: add explicit subagent guidance to recover from `[compacted: tool output removed to free context]` / `[truncated: output exceeded context limit]` markers by re-reading with smaller chunks instead of full-file `cat`. Thanks @tyler6204.
+- Agents/Tools: make `read` auto-page across chunks (when no explicit `limit` is provided) and scale its per-call output budget from model `contextWindow`, so larger contexts can read more before context guards kick in. Thanks @tyler6204.
+- Agents/Tools: strip duplicated `read` truncation payloads from tool-result `details` and make pre-call context guarding account for heavy tool-result metadata, so repeated `read` calls no longer bypass compaction and overflow model context windows. Thanks @tyler6204.
+- Reply threading: keep reply context sticky across streamed/split chunks and preserve `replyToId` on all chunk sends across shared and channel-specific delivery paths (including iMessage, BlueBubbles, Telegram, Discord, and Matrix), so follow-up bubbles stay attached to the same referenced message. Thanks @tyler6204.
+- Gateway/Agent: defer transient lifecycle `error` snapshots with a short grace window so `agent.wait` does not resolve early during retry/failover. Thanks @tyler6204.
 - iOS/Onboarding: stop auth Step 3 retry-loop churn by pausing reconnect attempts on unauthorized/missing-token gateway errors and keeping auth/pairing issue state sticky during manual retry. (#19153) Thanks @mbelinky.
 - Voice-call: auto-end calls when media streams disconnect to prevent stuck active calls. (#18435) Thanks @JayMishra-source.
 - Voice call/Gateway: prevent overlapping closed-loop turn races with per-call turn locking, route transcript dedupe via source-aware fingerprints with strict cache eviction bounds, and harden `voicecall latency` stats for large logs without spread-operator stack overflow. (#19140) Thanks @mbelinky.
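
The first Fixes entry describes a two-step guard: truncate any single oversized tool result, then compact the oldest tool results until the accumulated context fits. The guard's implementation is not part of the hunks shown on this page, so the following is only a minimal sketch of that idea. The message shape, the 4-characters-per-token estimate, and both share constants are assumptions made for illustration; only the two marker strings come from the changelog.

```typescript
// Hypothetical message shape; the real agent message types are not shown in this commit.
type Message = { role: "user" | "assistant" | "toolResult"; text: string };

// Marker strings as quoted in the changelog entry above.
const TRUNCATED_MARKER = "[truncated: output exceeded context limit]";
const COMPACTED_MARKER = "[compacted: tool output removed to free context]";

// Assumed values for illustration only.
const CHARS_PER_TOKEN = 4; // rough character-per-token estimate
const SINGLE_RESULT_SHARE = 0.5; // max share of the window for one tool result
const TOTAL_SHARE = 0.75; // max share of the window for all messages

function guardToolResults(messages: Message[], contextWindowTokens: number): Message[] {
  const budgetChars = contextWindowTokens * CHARS_PER_TOKEN;
  const singleLimit = Math.floor(budgetChars * SINGLE_RESULT_SHARE);
  const totalLimit = Math.floor(budgetChars * TOTAL_SHARE);

  // Step 1: truncate any single tool result that exceeds the per-result limit.
  const guarded = messages.map((m): Message => {
    if (m.role !== "toolResult" || m.text.length <= singleLimit) return m;
    const keep = Math.max(0, singleLimit - TRUNCATED_MARKER.length - 1);
    return { ...m, text: `${m.text.slice(0, keep)}\n${TRUNCATED_MARKER}` };
  });

  // Step 2: replace the oldest tool results with a marker until the total fits.
  let total = guarded.reduce((sum, m) => sum + m.text.length, 0);
  for (let i = 0; i < guarded.length && total > totalLimit; i++) {
    const m = guarded[i];
    if (m.role !== "toolResult" || m.text.length <= COMPACTED_MARKER.length) continue;
    total -= m.text.length - COMPACTED_MARKER.length;
    guarded[i] = { ...m, text: COMPACTED_MARKER };
  }
  return guarded;
}
```

Compacting oldest-first keeps the most recent tool output intact, which is usually what the model needs for its next step. The markers tell the agent to re-read in smaller chunks, as the second Fixes entry describes.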
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+import { describe, expect, it, vi } from "vitest";
+import { imessagePlugin } from "./channel.js";
+
+describe("imessagePlugin outbound", () => {
+  const cfg = {
+    channels: {
+      imessage: {
+        mediaMaxMb: 3,
+      },
+    },
+  };
+
+  it("forwards replyToId on direct sendText adapter path", async () => {
+    const sendIMessage = vi.fn().mockResolvedValue({ messageId: "m-text" });
+    const sendText = imessagePlugin.outbound?.sendText;
+    expect(sendText).toBeDefined();
+
+    const result = await sendText!({
+      cfg,
+      to: "chat_id:12",
+      text: "hello",
+      accountId: "default",
+      replyToId: "reply-1",
+      deps: { sendIMessage },
+    });
+
+    expect(sendIMessage).toHaveBeenCalledWith(
+      "chat_id:12",
+      "hello",
+      expect.objectContaining({
+        accountId: "default",
+        replyToId: "reply-1",
+        maxBytes: 3 * 1024 * 1024,
+      }),
+    );
+    expect(result).toEqual({ channel: "imessage", messageId: "m-text" });
+  });
+
+  it("forwards replyToId on direct sendMedia adapter path", async () => {
+    const sendIMessage = vi.fn().mockResolvedValue({ messageId: "m-media" });
+    const sendMedia = imessagePlugin.outbound?.sendMedia;
+    expect(sendMedia).toBeDefined();
+
+    const result = await sendMedia!({
+      cfg,
+      to: "chat_id:77",
+      text: "caption",
+      mediaUrl: "https://example.com/pic.png",
+      accountId: "acct-1",
+      replyToId: "reply-2",
+      deps: { sendIMessage },
+    });
+
+    expect(sendIMessage).toHaveBeenCalledWith(
+      "chat_id:77",
+      "caption",
+      expect.objectContaining({
+        mediaUrl: "https://example.com/pic.png",
+        accountId: "acct-1",
+        replyToId: "reply-2",
+        maxBytes: 3 * 1024 * 1024,
+      }),
+    );
+    expect(result).toEqual({ channel: "imessage", messageId: "m-media" });
+  });
+});

extensions/imessage/src/channel.ts

Lines changed: 4 additions & 2 deletions
@@ -183,7 +183,7 @@ export const imessagePlugin: ChannelPlugin<ResolvedIMessageAccount> = {
     chunker: (text, limit) => getIMessageRuntime().channel.text.chunkText(text, limit),
     chunkerMode: "text",
     textChunkLimit: 4000,
-    sendText: async ({ cfg, to, text, accountId, deps }) => {
+    sendText: async ({ cfg, to, text, accountId, deps, replyToId }) => {
       const send = deps?.sendIMessage ?? getIMessageRuntime().channel.imessage.sendMessageIMessage;
       const maxBytes = resolveChannelMediaMaxBytes({
         cfg,
@@ -195,10 +195,11 @@ export const imessagePlugin: ChannelPlugin<ResolvedIMessageAccount> = {
       const result = await send(to, text, {
         maxBytes,
         accountId: accountId ?? undefined,
+        replyToId: replyToId ?? undefined,
       });
       return { channel: "imessage", ...result };
     },
-    sendMedia: async ({ cfg, to, text, mediaUrl, accountId, deps }) => {
+    sendMedia: async ({ cfg, to, text, mediaUrl, accountId, deps, replyToId }) => {
       const send = deps?.sendIMessage ?? getIMessageRuntime().channel.imessage.sendMessageIMessage;
       const maxBytes = resolveChannelMediaMaxBytes({
         cfg,
@@ -211,6 +212,7 @@ export const imessagePlugin: ChannelPlugin<ResolvedIMessageAccount> = {
         mediaUrl,
         maxBytes,
         accountId: accountId ?? undefined,
+        replyToId: replyToId ?? undefined,
       });
       return { channel: "imessage", ...result };
     },
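
The adapter change above only forwards `replyToId` to the send function. The commit message also says the send path prepends or rewrites a leading `[[reply_to:<id>]]` tag based on that value, but that code is not shown in these hunks. The sketch below is one plausible reading of that normalization. The helper name `applyReplyTag` and the exact tag-matching regex are hypothetical, not the project's API.

```typescript
// Hypothetical helper: the real normalization lives in the iMessage send path,
// which is not included in the hunks above.

// Matches one leading [[reply_to:<id>]] tag plus surrounding whitespace (assumed syntax).
const LEADING_REPLY_TAG = /^\s*\[\[\s*reply_to\s*:\s*([^\]]*?)\s*\]\]\s*/i;

function applyReplyTag(text: string, replyToId?: string): string {
  const id = replyToId?.trim();
  // No explicit reply target: leave the text (and any tag it already carries) untouched.
  if (!id) return text;
  // Drop an existing leading tag so the explicit replyToId wins, then prepend the new tag.
  const body = text.replace(LEADING_REPLY_TAG, "");
  return `[[reply_to:${id}]]${body}`;
}
```

Under these assumptions, `applyReplyTag("hello", "42")` prepends a tag, while a message that already starts with a different tag has it rewritten, so a stale tag cannot redirect the reply.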
Lines changed: 131 additions & 0 deletions
@@ -0,0 +1,131 @@
+import type { MatrixClient } from "@vector-im/matrix-bot-sdk";
+import type { PluginRuntime, RuntimeEnv } from "openclaw/plugin-sdk";
+import { beforeEach, describe, expect, it, vi } from "vitest";
+
+const sendMessageMatrixMock = vi.hoisted(() => vi.fn().mockResolvedValue({ messageId: "mx-1" }));
+
+vi.mock("../send.js", () => ({
+  sendMessageMatrix: (...args: unknown[]) => sendMessageMatrixMock(...args),
+}));
+
+import { setMatrixRuntime } from "../../runtime.js";
+import { deliverMatrixReplies } from "./replies.js";
+
+describe("deliverMatrixReplies", () => {
+  const loadConfigMock = vi.fn(() => ({}));
+  const resolveMarkdownTableModeMock = vi.fn(() => "code");
+  const convertMarkdownTablesMock = vi.fn((text: string) => text);
+  const resolveChunkModeMock = vi.fn(() => "length");
+  const chunkMarkdownTextWithModeMock = vi.fn((text: string) => [text]);
+
+  const runtimeStub = {
+    config: {
+      loadConfig: (...args: unknown[]) => loadConfigMock(...args),
+    },
+    channel: {
+      text: {
+        resolveMarkdownTableMode: (...args: unknown[]) => resolveMarkdownTableModeMock(...args),
+        convertMarkdownTables: (...args: unknown[]) => convertMarkdownTablesMock(...args),
+        resolveChunkMode: (...args: unknown[]) => resolveChunkModeMock(...args),
+        chunkMarkdownTextWithMode: (...args: unknown[]) => chunkMarkdownTextWithModeMock(...args),
+      },
+    },
+    logging: {
+      shouldLogVerbose: () => false,
+    },
+  } as unknown as PluginRuntime;
+
+  const runtimeEnv: RuntimeEnv = {
+    log: vi.fn(),
+    error: vi.fn(),
+  } as unknown as RuntimeEnv;
+
+  beforeEach(() => {
+    vi.clearAllMocks();
+    setMatrixRuntime(runtimeStub);
+    chunkMarkdownTextWithModeMock.mockImplementation((text: string) => [text]);
+  });
+
+  it("keeps replyToId on first reply only when replyToMode=first", async () => {
+    chunkMarkdownTextWithModeMock.mockImplementation((text: string) => text.split("|"));
+
+    await deliverMatrixReplies({
+      replies: [
+        { text: "first-a|first-b", replyToId: "reply-1" },
+        { text: "second", replyToId: "reply-2" },
+      ],
+      roomId: "room:1",
+      client: {} as MatrixClient,
+      runtime: runtimeEnv,
+      textLimit: 4000,
+      replyToMode: "first",
+    });
+
+    expect(sendMessageMatrixMock).toHaveBeenCalledTimes(3);
+    expect(sendMessageMatrixMock.mock.calls[0]?.[2]).toEqual(
+      expect.objectContaining({ replyToId: "reply-1", threadId: undefined }),
+    );
+    expect(sendMessageMatrixMock.mock.calls[1]?.[2]).toEqual(
+      expect.objectContaining({ replyToId: "reply-1", threadId: undefined }),
+    );
+    expect(sendMessageMatrixMock.mock.calls[2]?.[2]).toEqual(
+      expect.objectContaining({ replyToId: undefined, threadId: undefined }),
+    );
+  });
+
+  it("keeps replyToId on every reply when replyToMode=all", async () => {
+    await deliverMatrixReplies({
+      replies: [
+        {
+          text: "caption",
+          mediaUrls: ["https://example.com/a.jpg", "https://example.com/b.jpg"],
+          replyToId: "reply-media",
+          audioAsVoice: true,
+        },
+        { text: "plain", replyToId: "reply-text" },
+      ],
+      roomId: "room:2",
+      client: {} as MatrixClient,
+      runtime: runtimeEnv,
+      textLimit: 4000,
+      replyToMode: "all",
+    });
+
+    expect(sendMessageMatrixMock).toHaveBeenCalledTimes(3);
+    expect(sendMessageMatrixMock.mock.calls[0]).toEqual([
+      "room:2",
+      "caption",
+      expect.objectContaining({ mediaUrl: "https://example.com/a.jpg", replyToId: "reply-media" }),
+    ]);
+    expect(sendMessageMatrixMock.mock.calls[1]).toEqual([
+      "room:2",
+      "",
+      expect.objectContaining({ mediaUrl: "https://example.com/b.jpg", replyToId: "reply-media" }),
+    ]);
+    expect(sendMessageMatrixMock.mock.calls[2]?.[2]).toEqual(
+      expect.objectContaining({ replyToId: "reply-text" }),
+    );
+  });
+
+  it("suppresses replyToId when threadId is set", async () => {
+    chunkMarkdownTextWithModeMock.mockImplementation((text: string) => text.split("|"));
+
+    await deliverMatrixReplies({
+      replies: [{ text: "hello|thread", replyToId: "reply-thread" }],
+      roomId: "room:3",
+      client: {} as MatrixClient,
+      runtime: runtimeEnv,
+      textLimit: 4000,
+      replyToMode: "all",
+      threadId: "thread-77",
+    });
+
+    expect(sendMessageMatrixMock).toHaveBeenCalledTimes(2);
+    expect(sendMessageMatrixMock.mock.calls[0]?.[2]).toEqual(
+      expect.objectContaining({ replyToId: undefined, threadId: "thread-77" }),
+    );
+    expect(sendMessageMatrixMock.mock.calls[1]?.[2]).toEqual(
+      expect.objectContaining({ replyToId: undefined, threadId: "thread-77" }),
+    );
+  });
+});

extensions/matrix/src/matrix/monitor/replies.ts

Lines changed: 11 additions & 8 deletions
@@ -53,8 +53,10 @@ export async function deliverMatrixReplies(params: {
 
     const shouldIncludeReply = (id?: string) =>
       Boolean(id) && (params.replyToMode === "all" || !hasReplied);
+    const replyToIdForReply = shouldIncludeReply(replyToId) ? replyToId : undefined;
 
     if (mediaList.length === 0) {
+      let sentTextChunk = false;
       for (const chunk of core.channel.text.chunkMarkdownTextWithMode(
         text,
         chunkLimit,
@@ -66,13 +68,14 @@ export async function deliverMatrixReplies(params: {
         }
         await sendMessageMatrix(params.roomId, trimmed, {
           client: params.client,
-          replyToId: shouldIncludeReply(replyToId) ? replyToId : undefined,
+          replyToId: replyToIdForReply,
           threadId: params.threadId,
           accountId: params.accountId,
         });
-        if (shouldIncludeReply(replyToId)) {
-          hasReplied = true;
-        }
+        sentTextChunk = true;
+      }
+      if (replyToIdForReply && !hasReplied && sentTextChunk) {
+        hasReplied = true;
       }
       continue;
     }
@@ -83,15 +86,15 @@ export async function deliverMatrixReplies(params: {
       await sendMessageMatrix(params.roomId, caption, {
         client: params.client,
         mediaUrl,
-        replyToId: shouldIncludeReply(replyToId) ? replyToId : undefined,
+        replyToId: replyToIdForReply,
         threadId: params.threadId,
         audioAsVoice: reply.audioAsVoice,
         accountId: params.accountId,
       });
-      if (shouldIncludeReply(replyToId)) {
-        hasReplied = true;
-      }
       first = false;
     }
+    if (replyToIdForReply && !hasReplied) {
+      hasReplied = true;
+    }
   }
 }
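
The key change in this diff is that the reply target is decided once per reply (`replyToIdForReply`) and `hasReplied` flips only after all chunks of that reply have been sent, so every chunk of the first reply stays attached to the referenced message. The following is a simplified, self-contained model of that decision logic, not the real function: `planReplyTargets` and its input shape are hypothetical, chunking is precomputed, and suppressing `replyToId` when a `threadId` is set is an assumption inferred from the third test case rather than from code shown here.

```typescript
type ReplyToMode = "first" | "all"; // only the modes exercised by the tests above

interface PlannedReply {
  chunks: string[]; // text already split into chunks (hypothetical input shape)
  replyToId?: string;
}

// Returns the replyToId that would accompany each sent chunk, in send order.
function planReplyTargets(
  replies: PlannedReply[],
  mode: ReplyToMode,
  threadId?: string,
): Array<string | undefined> {
  const targets: Array<string | undefined> = [];
  let hasReplied = false;
  for (const reply of replies) {
    // Assumption: thread sends carry no reply target.
    const candidate = threadId ? undefined : reply.replyToId;
    const include = Boolean(candidate) && (mode === "all" || !hasReplied);
    // Decide once per reply so every chunk shares the same target.
    const replyToIdForReply = include ? candidate : undefined;
    let sentTextChunk = false;
    for (const chunk of reply.chunks) {
      if (!chunk.trim()) continue; // empty chunks are skipped, not sent
      targets.push(replyToIdForReply);
      sentTextChunk = true;
    }
    // Mark the reply slot as used only after a chunk was actually sent.
    if (replyToIdForReply && sentTextChunk) hasReplied = true;
  }
  return targets;
}
```

With the inputs from the first test (`"first-a|first-b"` then `"second"`, mode `"first"`), this model yields `reply-1`, `reply-1`, `undefined`, matching the three expectations asserted there. Before the change, `hasReplied` flipped after the first chunk, so the second chunk lost its reply target.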

git-hooks/pre-commit

Lines changed: 14 additions & 3 deletions
@@ -18,14 +18,25 @@ fi
 
 # Security: avoid option-injection from malicious file names (e.g. "--all", "--force").
 # Robustness: NUL-delimited file list handles spaces/newlines safely.
-mapfile -d '' -t files < <(git diff --cached --name-only --diff-filter=ACMR -z)
+# Compatibility: use read loops instead of `mapfile` so this runs on macOS Bash 3.x.
+files=()
+while IFS= read -r -d '' file; do
+  files+=("$file")
+done < <(git diff --cached --name-only --diff-filter=ACMR -z)
 
 if [ "${#files[@]}" -eq 0 ]; then
   exit 0
 fi
 
-mapfile -d '' -t lint_files < <(node "$FILTER_FILES" lint -- "${files[@]}")
-mapfile -d '' -t format_files < <(node "$FILTER_FILES" format -- "${files[@]}")
+lint_files=()
+while IFS= read -r -d '' file; do
+  lint_files+=("$file")
+done < <(node "$FILTER_FILES" lint -- "${files[@]}")
+
+format_files=()
+while IFS= read -r -d '' file; do
+  format_files+=("$file")
+done < <(node "$FILTER_FILES" format -- "${files[@]}")
 
 if [ "${#lint_files[@]}" -gt 0 ]; then
   "$RUN_NODE_TOOL" oxlint --type-aware --fix -- "${lint_files[@]}"

src/agents/pi-embedded-runner/compact.ts

Lines changed: 1 addition & 0 deletions
@@ -381,6 +381,7 @@ export async function compactEmbeddedPiSessionDirect(
       abortSignal: runAbortController.signal,
       modelProvider: model.provider,
       modelId,
+      modelContextWindowTokens: model.contextWindow,
       modelAuthMode: resolveModelAuthMode(model.provider, params.config),
     });
     const tools = sanitizeToolsForGoogle({ tools: toolsRaw, provider });
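
Passing `modelContextWindowTokens` into tool creation is what lets `read` scale its per-call output budget and auto-page when no explicit `limit` is given, as the changelog describes. The `read` tool itself is not part of the hunks on this page, so the following is only a rough sketch of how such scaling and paging could work. Every name and constant here (`scaleReadBudgetChars`, `readWithAutoPaging`, the 20% share, the 50,000-character floor, the 4-characters-per-token estimate) is an assumption made for illustration.

```typescript
// All constants are assumed values, not taken from the project.
const ASSUMED_CHARS_PER_TOKEN = 4;
const ASSUMED_READ_SHARE = 0.2; // fraction of the context window one read call may fill
const ASSUMED_MIN_BUDGET_CHARS = 50_000; // floor so small models can still read something

// Scale the per-call output budget from the model's context window.
function scaleReadBudgetChars(contextWindowTokens: number): number {
  const scaled = Math.floor(contextWindowTokens * ASSUMED_CHARS_PER_TOKEN * ASSUMED_READ_SHARE);
  return Math.max(ASSUMED_MIN_BUDGET_CHARS, scaled);
}

interface PagedRead {
  text: string;
  nextOffset?: number; // set when more lines remain, so the caller can continue
}

// Keep appending pages until the next page would exceed the budget.
function readWithAutoPaging(lines: string[], pageSize: number, budgetChars: number): PagedRead {
  const size = Math.max(1, Math.floor(pageSize)); // guard against a zero or fractional page size
  const out: string[] = [];
  let used = 0;
  let offset = 0;
  while (offset < lines.length) {
    const page = lines.slice(offset, offset + size);
    const pageChars = page.reduce((n, line) => n + line.length + 1, 0); // +1 for the newline
    // Always return at least one page, even if it alone exceeds the budget.
    if (out.length > 0 && used + pageChars > budgetChars) break;
    out.push(...page);
    used += pageChars;
    offset += page.length;
  }
  return { text: out.join("\n"), nextOffset: offset < lines.length ? offset : undefined };
}
```

In this model, a larger context window yields a larger budget, so more pages are folded into a single call. `nextOffset` plays the role of the continuation guidance mentioned in the commit message, telling the agent where to resume.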

src/agents/pi-embedded-runner/run/attempt.ts

Lines changed: 14 additions & 0 deletions
@@ -30,6 +30,7 @@ import {
   listChannelSupportedActions,
   resolveChannelMessageToolHints,
 } from "../../channel-tools.js";
+import { DEFAULT_CONTEXT_TOKENS } from "../../defaults.js";
 import { resolveOpenClawDocsPath } from "../../docs-path.js";
 import { isTimeoutError } from "../../failover-error.js";
 import { resolveModelAuthMode } from "../../model-auth.js";
@@ -95,6 +96,7 @@ import {
   buildEmbeddedSystemPrompt,
   createSystemPromptOverride,
 } from "../system-prompt.js";
+import { installToolResultContextGuard } from "../tool-result-context-guard.js";
 import { splitSdkTools } from "../tool-split.js";
 import { describeUnknownError, mapThinkingLevel } from "../utils.js";
 import { flushPendingToolResultsAfterIdle } from "../wait-for-idle-before-flush.js";
@@ -313,6 +315,7 @@ export async function runEmbeddedAttempt(
         abortSignal: runAbortController.signal,
         modelProvider: params.model.provider,
         modelId: params.modelId,
+        modelContextWindowTokens: params.model.contextWindow,
         modelAuthMode: resolveModelAuthMode(params.model.provider, params.config),
         currentChannelId: params.currentChannelId,
         currentThreadTs: params.currentThreadTs,
@@ -492,6 +495,7 @@ export async function runEmbeddedAttempt(
 
     let sessionManager: ReturnType<typeof guardSessionManager> | undefined;
     let session: Awaited<ReturnType<typeof createAgentSession>>["session"] | undefined;
+    let removeToolResultContextGuard: (() => void) | undefined;
     try {
       await repairSessionFileIfNeeded({
         sessionFile: params.sessionFile,
@@ -587,6 +591,15 @@ export async function runEmbeddedAttempt(
         throw new Error("Embedded agent session missing");
       }
       const activeSession = session;
+      removeToolResultContextGuard = installToolResultContextGuard({
+        agent: activeSession.agent,
+        contextWindowTokens: Math.max(
+          1,
+          Math.floor(
+            params.model.contextWindow ?? params.model.maxTokens ?? DEFAULT_CONTEXT_TOKENS,
+          ),
+        ),
+      });
       const cacheTrace = createCacheTrace({
         cfg: params.config,
         env: process.env,
@@ -1251,6 +1264,7 @@ export async function runEmbeddedAttempt(
       // flushPendingToolResults() fires while tools are still executing, inserting
       // synthetic "missing tool result" errors and causing silent agent failures.
       // See: https://github.com/openclaw/openclaw/issues/8643
+      removeToolResultContextGuard?.();
       await flushPendingToolResultsAfterIdle({
         agent: session?.agent,
         sessionManager,
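
The hunks above show two patterns: the context window is resolved with a fallback chain and clamped to a positive integer, and the guard is installed through a function that returns its own removal callback, which is then called during cleanup. The sketch below illustrates both in isolation. The body of `installToolResultContextGuard` is not shown in this commit page, so `installGuard`, the `AgentLike` shape, hooking a `transformContext` property (suggested only by the commit message), and the placeholder default are all assumptions.

```typescript
// Placeholder value; the real constant is imported from defaults.js and not shown here.
const DEFAULT_CONTEXT_TOKENS = 200_000;

interface ModelLike {
  contextWindow?: number;
  maxTokens?: number;
}

// Same fallback chain and clamping as in the diff: contextWindow, then maxTokens, then the default.
function resolveContextWindowTokens(model: ModelLike): number {
  return Math.max(1, Math.floor(model.contextWindow ?? model.maxTokens ?? DEFAULT_CONTEXT_TOKENS));
}

// Hypothetical agent shape with a single pre-call hook.
type TransformContext = (messages: string[]) => string[];
interface AgentLike {
  transformContext?: TransformContext;
}

// Wraps any existing hook and returns a disposer that restores the previous one.
function installGuard(agent: AgentLike, guard: TransformContext): () => void {
  const previous = agent.transformContext;
  agent.transformContext = (messages) => guard(previous ? previous(messages) : messages);
  return () => {
    agent.transformContext = previous;
  };
}
```

Returning a disposer keeps installation and teardown paired, which matches how `removeToolResultContextGuard?.()` is invoked in the cleanup path. The optional call also covers the case where setup failed before the guard was installed.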
