What happens
shell_execute reads the entire child process stdout and stderr into memory before any size limit is applied. A command that emits a few hundred MB — kubectl logs, journalctl, curl of a large API response, cat of a big file — allocates that whole payload (and then some) on the Large Object Heap in one shot. In a memory-limited container that is an instant OOM, even though the model never sees more than a few KB of the output.
This is the most likely trigger for OOM kills on an autonomous agent that runs shell commands to pull logs.
Why
The drain reads to end with no cap (netclaw/src/Netclaw.Actors/Tools/ShellTool.cs, lines 147 to 148 in 60601c6):

    var stdoutTask = process.StandardOutput.ReadToEndAsync(CancellationToken.None);
    var stderrTask = process.StandardError.ReadToEndAsync(CancellationToken.None);

Then the result is assembled and copied again, redacted (another copy), and only then truncated (netclaw/src/Netclaw.Actors/Tools/ShellTool.cs, lines 196 to 210 in 60601c6):

    outputBuilder.Append(await stdoutTask);
    errorBuilder.Append(await stderrTask);

    var result = new StringBuilder();
    if (outputBuilder.Length > 0)
        result.Append(outputBuilder);
    if (errorBuilder.Length > 0)
    {
        if (result.Length > 0)
            result.AppendLine();
        result.Append(errorBuilder);
    }

    var sanitized = SecretOutputRedactor.Redact(result.ToString());
    var output = TruncateOutput(sanitized, _config.MaxOutputChars);

So for an N-byte output you transiently hold several multiples of N live at once: the ReadToEndAsync string, the StringBuilder append, the result.ToString(), and the redactor's output, all before TruncateOutput cuts it down. Strings are UTF-16, so an N-byte ASCII log becomes a 2N-byte .NET string, and any object of 85,000 bytes or more lands on the LOH (which is not compacted by default, so the memory lingers). A single ~300MB log can momentarily need over 1GB.
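To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. PeakEstimate and its parameters are hypothetical names used only for illustration, and the number of copies alive at the same moment is an assumption that depends on when the GC can reclaim each intermediate.

```csharp
using System;

public static class PeakEstimate
{
    // Rough estimate only: each ASCII byte becomes one UTF-16 char (2 bytes),
    // and liveCopies is an assumed count of full-size copies alive at once.
    public static long EstimateBytes(long asciiBytes, int liveCopies)
        => asciiBytes * 2 * liveCopies;
}
```

With a 300MB log (314,572,800 bytes), a single string is already about 629MB, and two full-size copies alive at once come to 1,258,291,200 bytes, which is consistent with the "over 1GB" figure above.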
The MaxOutputChars cap (default 32000) protects the model's context, not the process. It runs last (netclaw/src/Netclaw.Actors/Tools/ShellTool.cs, line 215 in 60601c6):

    internal static string TruncateOutput(string output, int maxChars)

There's even a second cap downstream: ClampToolResult clamps again before the result enters session history, so the model is doubly protected while the process is not protected at all (netclaw/src/Netclaw.Actors/Sessions/Pipelines/SessionToolExecutionPipeline.cs, lines 979 to 987 in 60601c6):

    public static string ClampToolResult(string resultText, int maxInlineToolResultChars)
    {
        if (maxInlineToolResultChars <= 0 || resultText.Length <= maxInlineToolResultChars)
            return resultText;

        var omittedChars = resultText.Length - maxInlineToolResultChars;
        return resultText[..maxInlineToolResultChars]
            + $"\n[tool result truncated: omitted {omittedChars} chars to protect context window]";
    }

Suggested direction
- Bound the read instead of reading to end. Drain into a fixed-size buffer (something like MaxOutputChars plus a margin) and stop once it's full: kill the process or keep discarding so the pipe doesn't deadlock. We already kill on timeout, so the kill path exists.
- Once the read is bounded, collapse the copies: truncate first, then redact the small result, then build the final string. No reason to redact or ToString a 300MB buffer we're about to throw away.
- Consider capturing a head+tail window rather than head-only, since the exit summary often matters, but that's secondary to bounding the allocation.
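One way the bounded read in the bullets above could look, as a minimal sketch and not the project's implementation. BoundedDrain and ReadHeadTailAsync are hypothetical names, the window sizes are illustrative, and any TextReader works as input (process.StandardOutput is a StreamReader, a StringReader is enough for a test). It keeps a fixed head window and a ring-buffered tail window, and keeps reading and discarding the middle so the child's pipe never fills.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedDrain
{
    // Hypothetical helper: retains at most headChars + tailChars characters,
    // no matter how much the reader produces.
    public static async Task<(string Text, long TotalChars)> ReadHeadTailAsync(
        TextReader reader, int headChars, int tailChars, CancellationToken ct = default)
    {
        var head = new StringBuilder();
        var tail = new char[Math.Max(tailChars, 0)]; // ring buffer for the last chars
        long pastHead = 0;                           // chars seen after the head filled up
        long total = 0;
        var chunk = new char[8192];                  // the only per-read allocation

        int read;
        while ((read = await reader.ReadAsync(chunk.AsMemory(), ct)) > 0)
        {
            total += read;
            for (var i = 0; i < read; i++)
            {
                if (head.Length < headChars)
                {
                    head.Append(chunk[i]);
                }
                else
                {
                    // Overwrite the oldest tail slot; with no tail window, just discard.
                    if (tail.Length > 0)
                        tail[(int)(pastHead % tail.Length)] = chunk[i];
                    pastHead++;
                }
            }
        }

        var kept = (int)Math.Min(pastHead, tail.Length);
        var omitted = pastHead - kept;
        if (omitted > 0)
            head.Append($"\n[... omitted {omitted} chars ...]\n");

        if (kept > 0)
        {
            // Once the ring has wrapped, the oldest surviving char sits at pastHead % length.
            var start = pastHead <= tail.Length ? 0 : (int)(pastHead % tail.Length);
            head.Append(tail, start, kept - start);
            head.Append(tail, 0, start);
        }

        return (head.ToString(), total);
    }
}
```

For example, reading "0123456789" with headChars 3 and tailChars 2 yields "012", then a marker reporting 5 omitted chars, then "89". Redaction would then run on this small string and not on the full output, which is the ordering the second bullet suggests.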
Notes
This pairs with the GC default issue (Server GC in a small container amplifies the spike and is slow to give the memory back). Bounding the read is the real fix; the GC change just widens the margin.
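To confirm which GC mode a given container is actually running, a small diagnostic along these lines could be logged at startup. GcDiagnostics is a hypothetical name, and this is only a sketch for checking the setting; it does not change any GC behavior.

```csharp
using System;
using System.Runtime;

public static class GcDiagnostics
{
    // Reports whether the runtime is using Server or Workstation GC and the
    // current LOH compaction mode (Default means the LOH is not compacted).
    public static string Describe()
    {
        var mode = GCSettings.IsServerGC ? "Server" : "Workstation";
        return $"GC mode: {mode}, LOH compaction: {GCSettings.LargeObjectHeapCompactionMode}";
    }
}
```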