What happens
BackgroundJobExecutionActor reads a child process's entire stdout and stderr into memory before anything is bounded, then holds that full string several times over. For a long-running, chatty job — which is exactly what background jobs are for (log tails, builds, watchers) — that's the same unbounded-allocation shape that #1293 just fixed for shell_execute, on a path arguably more prone to it.
The full output is materialized via ReadToEndAsync on both streams:
    var stdoutTask = process.StandardOutput.ReadToEndAsync();
    var stderrTask = process.StandardError.ReadToEndAsync();
then appended into a StringBuilder, redacted as one big string, written to disk in full, and shipped again inside the actor message — before the tail is finally sliced off:
    var fullOutput = SecretOutputRedactor.Redact(outputBuilder.ToString());

    try
    {
        await File.WriteAllTextAsync(_outputLogPath, fullOutput);
    }
    catch // slopwatch-ignore: SW003 best-effort log write — output still delivered via actor message
    {
    }

    self.Tell(new ProcessExited(process.ExitCode, fullOutput));
The MaxOutputTailChars trim that bounds the message runs only after all of that, so it bounds the message, not the heap.
Why it matters
On a memory-limited daemon (we run at 1Gi) a job that emits a few hundred MB spikes RSS the same way #1293 did, and the cgroup OOM-kills the process — restart count stays 0, so it's invisible to kubectl get pods. The agent can reach this path directly by launching a background job instead of a foreground shell_execute, so closing only the shell path leaves the door open.
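As a rough illustration of why a few hundred MB of output is enough, the sketch below estimates the managed-heap cost of holding the output as strings. The one grounded fact is that .NET strings store UTF-16 code units, so each char costs 2 bytes even when the child process wrote single-byte ASCII. The helper name `OutputMemoryEstimate` and the number of simultaneously live copies are assumptions for illustration, not measurements from the actor.

```csharp
public static class OutputMemoryEstimate
{
    // .NET strings hold UTF-16 code units: 2 bytes per char.
    public const int BytesPerChar = 2;

    // Rough lower bound: ignores object headers and allocator overhead.
    // liveCopies is an assumption, e.g. the ReadToEndAsync result, the
    // StringBuilder contents, and the ToString()/redacted string.
    public static long PeakBytes(long outputChars, int liveCopies)
        => outputChars * BytesPerChar * liveCopies;
}
```

Under these assumptions, 200 million chars of output with three live copies comes to 1,200,000,000 bytes, which already exceeds a 1Gi (1,073,741,824 byte) limit before counting anything else the daemon holds.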
Suggested direction
Stream the pipes to the on-disk log in bounded memory while keeping only a bounded tail in memory for the completion message — i.e. tee to disk + retain the last N chars, never holding the whole output as a managed string. This is the same "bound the read" lesson as #1293; ideally it shares one bounded-output reader with shell_execute rather than re-implementing the ring/window logic (see the extraction discussion under #1293).
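A minimal sketch of the tee-plus-tail idea, assuming hypothetical names (`TailBuffer`, `BoundedOutputPump.TeeAsync`) that do not exist in the repository. It reads one pipe in fixed-size chunks, writes each chunk to the log, and keeps only the last N chars in a ring buffer. Redaction and the interleaving of stdout with stderr are deliberately left out.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper: retains only the last `capacity` chars appended to it.
public sealed class TailBuffer
{
    private readonly char[] _ring;
    private int _start; // index of the oldest retained char
    private int _count; // number of retained chars

    public TailBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _ring = new char[capacity];
    }

    public void Append(char[] chunk, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
        {
            int writeIndex = (_start + _count) % _ring.Length;
            _ring[writeIndex] = chunk[i];
            if (_count < _ring.Length) _count++;
            else _start = (_start + 1) % _ring.Length; // overwrote the oldest char
        }
    }

    public override string ToString()
    {
        var sb = new StringBuilder(_count);
        for (int i = 0; i < _count; i++)
            sb.Append(_ring[(_start + i) % _ring.Length]);
        return sb.ToString();
    }
}

public static class BoundedOutputPump
{
    // Copies `source` to `log` chunk by chunk and returns the last
    // `maxTailChars` chars. Memory use is bounded by the chunk buffer
    // plus the ring, regardless of how much the process emits.
    public static async Task<string> TeeAsync(
        TextReader source, TextWriter log, int maxTailChars, int bufferChars = 4096)
    {
        var tail = new TailBuffer(maxTailChars);
        var buffer = new char[bufferChars];
        int read;
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await log.WriteAsync(buffer, 0, read);
            tail.Append(buffer, 0, read);
        }
        await log.FlushAsync();
        return tail.ToString();
    }
}
```

In the actor this would presumably be called once per pipe with `process.StandardOutput` or `process.StandardError` as `source`, a `StreamWriter` over `_outputLogPath` as `log`, and `MaxOutputTailChars` as the tail size. Two pumps writing to the same log would need a shared writer with some ordering or locking policy, which this sketch does not address.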
The redaction step needs thought when streaming: today it redacts the full string in one pass. A streamed write wants redaction applied incrementally (line/overlap-based) so secrets are still scrubbed without buffering everything.
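One way the line-based variant could look, as a sketch only: each line is redacted on its own and written straight to the log. `StreamingRedaction.RedactLinesAsync` is a hypothetical name, and the redactor is passed in as a `Func<string, string>` so the example stays self-contained. It assumes `SecretOutputRedactor.Redact` could be applied per line, which the issue does not confirm.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class StreamingRedaction
{
    // Redacts and writes one line at a time; returns the number of lines written.
    // Limitations of this sketch:
    //  - a secret that spans a line break is not caught (an overlap window
    //    would be needed for that);
    //  - a single very long line is still buffered whole by ReadLineAsync;
    //  - line endings are normalized to log.NewLine.
    public static async Task<long> RedactLinesAsync(
        TextReader source, TextWriter log, Func<string, string> redactLine)
    {
        long lines = 0;
        while (await source.ReadLineAsync() is string line)
        {
            await log.WriteLineAsync(redactLine(line));
            lines++;
        }
        await log.FlushAsync();
        return lines;
    }
}
```

Feeding the redacted line, rather than the raw one, into whatever tail buffer backs the completion message would keep the log and the message consistent. The second limitation matters for jobs that emit progress bars or binary data without newlines, so a production version would likely need a maximum line length as well.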
Related
#1293 (same allocation pattern, foreground shell path).
netclaw/src/Netclaw.Actors/Jobs/BackgroundJobExecutionActor.cs, lines 115 to 116 in 60601c6 (the ReadToEndAsync calls).
netclaw/src/Netclaw.Actors/Jobs/BackgroundJobExecutionActor.cs, lines 130 to 140 in 60601c6 (redaction, log write, and the ProcessExited message).