ProcessRunner: add PROCESS_KILL_TREE option #5201

@aleks-f

Summary

On Windows, TerminateProcess() kills only the target process; it does not terminate child processes spawned by it. When ProcessRunner::stop() force-kills a process after the timeout, any grandchild processes are orphaned and continue running. These orphans can hold DLLs, ports, and file handles open indefinitely.

Reproduction

  1. Use ProcessRunner to launch a process that itself spawns child processes (e.g., a process supervisor that launches worker processes)
  2. Call ProcessRunner::stop(): it sends a termination signal, waits, then calls TerminateProcess() after the timeout
  3. The main process is killed, but its children survive as orphans
  4. The orphaned processes hold resources (TCP ports, DLL locks, file handles) preventing subsequent launches or cleanup

Root cause

ProcessRunner::stop() ultimately calls TerminateProcess(), which affects only the single process identified by its handle. Windows does not propagate termination to child processes on its own; a tree is killed together only if its processes were placed in a Job Object beforehand (the closest Unix analogue is signalling a process group with kill(-pgid, sig)).

Proposed fix

Add a PROCESS_KILL_TREE option flag that, when set, creates a Windows Job Object with JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE and assigns the child process to it. When the last handle to the Job Object is closed (either explicitly in stop() or implicitly when ProcessRunner is destroyed), the kernel automatically terminates all processes still in the job: the child and every descendant it created after being assigned.

API

// New flag in Process.h (or ProcessRunner.h), next available bit:
static const int PROCESS_KILL_TREE = 0x10;

// Usage:
ProcessRunner pr("agent.exe", args,
    Process::PROCESS_CLOSE_STDOUT | Process::PROCESS_KILL_TREE);

Implementation (Windows)

In ProcessRunner::start(), after CreateProcess:

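#if defined(POCO_OS_FAMILY_WINDOWS)
if (_options & PROCESS_KILL_TREE)
{
    // Unnamed job; closing its last handle kills every process assigned to it.
    _hJob = CreateJobObjectW(nullptr, nullptr);
    if (_hJob)
    {
        JOBOBJECT_EXTENDED_LIMIT_INFORMATION jeli = {};
        jeli.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE;
        // Both calls can fail; on failure release the job so that stop()
        // falls back to plain TerminateProcess().
        if (!SetInformationJobObject(_hJob, JobObjectExtendedLimitInformation, &jeli, sizeof(jeli)) ||
            !AssignProcessToJobObject(_hJob, _processHandle))
        {
            CloseHandle(_hJob);
            _hJob = nullptr;
        }
    }
}
#endif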

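In stop() and the destructor, close _hJob with CloseHandle(), either after the process has exited or earlier to force-kill the whole tree. One caveat: any process the child spawns before AssignProcessToJobObject() returns is not part of the job. Creating the child with CREATE_SUSPENDED and resuming it after the assignment would close that window.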

Implementation (Unix)

On Unix, the equivalent is process groups:

  • In the child (after fork(), before exec()): call setpgid(0, 0) to create a new process group whose id equals the child's pid
  • In stop(): use kill(-pgid, sig), with pgid being that pid, to signal the entire process group instead of just the leader

This should also be gated on PROCESS_KILL_TREE to avoid changing default behavior.
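
The Unix approach can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not Poco code: KILL_TREE, ProcessTree, spawnTree(), stopTree() and treeIsGone() are hypothetical names, pause() stands in for the exec() of a real program, and the pipe exists only so the caller can observe when every descendant has exited.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical flag mirroring the proposed PROCESS_KILL_TREE value.
const int KILL_TREE = 0x10;

// Hypothetical stand-in for a launched process tree (not Poco code).
struct ProcessTree
{
    pid_t pid;    // direct child; also the group id when KILL_TREE is set
    int aliveFd;  // read end of a pipe whose write end the descendants hold
    int options;
};

// Forks a child that spawns one grandchild; both then idle until signalled.
// Returns once the grandchild is running; pid is -1 on failure.
ProcessTree spawnTree(int options)
{
    ProcessTree failed = {-1, -1, options};
    int fds[2];
    if (pipe(fds) != 0) return failed;
    pid_t pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return failed;
    }
    if (pid == 0)
    {
        close(fds[0]);
        // After fork(), before exec(): lead a new group only when asked to.
        if (options & KILL_TREE) setpgid(0, 0);
        pid_t grandchild = fork();
        if (grandchild < 0) _exit(1);
        if (grandchild == 0)
        {
            char ready = 'r';  // grandchild announces that it is running
            if (write(fds[1], &ready, 1) != 1) _exit(1);
        }
        for (;;) pause();      // child and grandchild both idle here
    }
    close(fds[1]);
    char ready = 0;
    if (read(fds[0], &ready, 1) != 1)  // grandchild never started
    {
        kill(pid, SIGKILL);
        waitpid(pid, nullptr, 0);
        close(fds[0]);
        return failed;
    }
    ProcessTree tree = {pid, fds[0], options};
    return tree;
}

// Signals the whole group (negative pid) when KILL_TREE is set, otherwise
// only the direct child, then reaps the direct child. Intended for signals
// that terminate the target, such as SIGTERM or SIGKILL.
bool stopTree(const ProcessTree& tree, int sig)
{
    if (tree.pid <= 1) return false;  // 0 and -1 have special meanings for kill()
    pid_t target = (tree.options & KILL_TREE) ? -tree.pid : tree.pid;
    if (kill(target, sig) != 0) return false;
    while (waitpid(tree.pid, nullptr, 0) < 0)
    {
        if (errno != EINTR) return false;
    }
    return true;
}

// True if every descendant exited within timeoutMs: the pipe reports
// end-of-file only after all copies of its write end are closed.
bool treeIsGone(const ProcessTree& tree, int timeoutMs)
{
    assert(tree.aliveFd >= 0);
    pollfd pfd = {tree.aliveFd, POLLIN, 0};
    if (poll(&pfd, 1, timeoutMs) <= 0) return false;
    char c;
    return read(tree.aliveFd, &c, 1) == 0;
}
```

Calling spawnTree(KILL_TREE), then stopTree(tree, SIGKILL), then treeIsGone(tree, 5000) should report that the grandchild went down with the group. With options set to 0, the same stopTree() call signals only the direct child and the grandchild keeps the pipe open, which is the orphaning this issue describes.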

Why opt-in (disabled by default)

Some legitimate patterns rely on grandchildren surviving the parent:

  • Daemonization (double-fork on Unix)
  • Service restart via detached helper process
  • Process handoff patterns

Making this opt-in preserves backward compatibility.

Affected files

  • platform/Foundation/include/Poco/Process.h — add PROCESS_KILL_TREE constant
  • platform/Foundation/include/Poco/ProcessRunner.h — add _hJob member (Windows)
  • platform/Foundation/src/ProcessRunner.cpp — Job Object creation in start(), cleanup in stop()/destructor
  • platform/Foundation/src/Process_UNIX.cpp — optional setpgid() in child, kill(-pgid) in requestTermination()

Note

This issue complements the NamedEvent race fix (#5199). Together they address the two main reliability problems with ProcessRunner on Windows: lost termination signals and orphaned grandchild processes.
