fix(exec): kill child process tree on timeout to prevent orphaned tasks#378
Conversation
nikolasdehor
left a comment
There was a problem hiding this comment.
Solid PR — the approach (process groups on Unix, taskkill /T on Windows) is the canonical solution for orphaned child processes.
What I liked:
Setpgid: true+syscall.Kill(-pid, SIGKILL)is the textbook Go approach for process group cleanup ✓- Build tags (
!windows/windows) for platform separation are idiomatic ✓ - The regression test spawns
sleep 60and verifies it gets killed — well-structured ✓ - Already-exited process handling is correct (nil checks, discarded
ESRCHerrors) ✓ - Race between normal completion and timeout is handled correctly by the
select✓ errors.Isimprovement over==forcontext.DeadlineExceeded✓
Suggestions (non-blocking):
-
Double-kill interaction: The command is created with
exec.CommandContext, so Go's internal goroutine will also try to kill the direct child when the context expires. This is benign (killing an already-killed process is harmless), but slightly confusing. On Go 1.20+, you could usecmd.Cancelto hookterminateProcessTreedirectly into theCommandContextmachinery, eliminating the hybrid approach:cmd.Cancel = func() error { return terminateProcessTree(cmd) } cmd.WaitDelay = 2 * time.Second
-
terminateProcessTreealways returnsnil— either return meaningful errors or change to no return value, since callers always discard it with_ =. -
Consider adding a normal-completion test alongside the timeout test, to verify the
Start + Waitrefactor doesn't break the happy path. -
SIGTERM before SIGKILL: Currently goes straight to SIGKILL. For an exec tool where the command has already timed out this is defensible, but a brief SIGTERM grace period (e.g., 500ms) before SIGKILL would allow children to flush buffers and clean up temp files.
Approving — these are all improvements, not blockers.
|
@lunareed720 Solid fix! Killing the entire process group instead of just the parent shell is the right call to avoid orphaned tasks. The cross-platform handling with Setpgid on Unix and taskkill on Windows is thoughtful too. We're building the PicoClaw Dev Group on Discord for contributors to connect. If you're interested, drop an email to |
…ss-tree fix(exec): kill child process tree on timeout to prevent orphaned tasks
Summary
This fixes orphaned background processes when
execcommands time out.What changed
Setpgid: true)kill -SIGKILL -pgid)taskkill /T /F /PIDfallbackExecToolexecution flow toStart + Waitwith explicit timeout path and cleanupsleep) does not leave child process aliveWhy
Issue #311 reports that failed/timed-out shell commands keep running in background. This can leak resources and degrade long-running agent stability.
Testing
go test ./pkg/tools -run TestShellTool -count=1Closes #311