Skip to content

5199 process termination#5203

Merged
matejk merged 24 commits intomainfrom
5199-process-termination
Feb 13, 2026
Merged

5199 process termination#5203
matejk merged 24 commits intomainfrom
5199-process-termination

Conversation

@aleks-f
Copy link
Copy Markdown
Member

@aleks-f aleks-f commented Feb 11, 2026

No description provided.

aleks-f and others added 5 commits February 11, 2026 16:29
Note: Types.h change already present in upstream poco.
Only TestResult.cpp needed porting.
…ect #5201

On Windows, TerminateProcess only kills the target process — grandchildren
survive as orphans. PROCESS_KILL_TREE creates a Job Object with
JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE so the entire process tree is torn down
when ProcessRunner is stopped or destroyed.
…s groups #5201

On fork, the child calls setpgid(0, 0) to create a new process group.
On stop, kill(-pid, sig) signals the entire group instead of just the
leader. Gated on PROCESS_KILL_TREE to preserve default behavior.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves process termination behavior across platforms, primarily by adding an option to terminate an entire spawned process tree and by addressing a Windows termination-event lifetime race.

Changes:

  • Add PROCESS_KILL_TREE process option and implement tree-termination support (Windows Job Objects, Unix process groups).
  • Adjust Windows ServerApplication::hasConsole() detection to avoid misclassifying interactive launches when stdout/stderr are closed.
  • Refine CppUnit cxxabi.h availability detection to avoid _MSC_VER toolchains (clang-cl in particular).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
Util/src/ServerApplication.cpp Expands console detection (console window + stdin/out/err handles) to avoid false “no console” when std handles are closed.
Foundation/src/Process_UNIX.cpp Adds setpgid(0, 0) for PROCESS_KILL_TREE to enable process-group based signaling.
Foundation/src/ProcessRunner.cpp Implements Windows Job Object assignment for kill-tree, and Unix group-based polling/kill logic; adjusts Windows termination signaling to keep the event alive.
Foundation/include/Poco/ProcessRunner.h Stores a Windows Job Object handle (_hJob) when PROCESS_KILL_TREE is enabled.
Foundation/include/Poco/ProcessOptions.h Introduces PROCESS_KILL_TREE and documents intended platform behavior.
CppUnit/src/TestResult.cpp Updates HAVE_CXXABI_H detection to exclude MSVC toolchains.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@aleks-f aleks-f requested a review from matejk February 12, 2026 06:52
@matejk
Copy link
Copy Markdown
Contributor

matejk commented Feb 12, 2026

Related ticket #5201 is actually a new feature that adds functionality and extends the API.

Shall we add new feature to a bugfix release?

@aleks-f
Copy link
Copy Markdown
Member Author

aleks-f commented Feb 12, 2026

Related ticket #5201 is actually a new feature that adds functionality and extends the API.

Shall we add new feature to a bugfix release?

It doesn't break anything, it is just one more bit in options, off by default. strictly speaking, we shouldn't, but we need it and I don't want to wait for next release

@matejk
Copy link
Copy Markdown
Contributor

matejk commented Feb 12, 2026

The name of the PR does not reflect the actual changes: it fixes multiple issies.

Copy link
Copy Markdown
Contributor

@matejk matejk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Process Termination Fixes

CI is green (all 88 checks pass, including ASAN/TSAN/UBSAN). The clang-cl fix in TestResult.cpp and the QNX SPAWN_SETGROUP/SPAWN_NEWPGROUP path look correct and clean.

Critical

  1. _hJob handle leaked on normal process exit — The Job Object is created in run() but only cleaned up in stop() inside the _pPH guard. If the child exits on its own, run() completes and nulls _pPH, so stop() skips the cleanup block. Handle leaks on every natural-exit cycle.

  2. Stale errno in kill(-pid, 0) polling loopThread::sleep(10) between iterations can modify errno. If kill() returns -1/ESRCH (group gone) but errno was clobbered to EPERM by sleep internals, the || errno == EPERM condition passes and the loop spins until timeout.

  3. Unchecked setpgid(0, 0) return — If it fails, the child continues in the parent's process group. Later kill(-pid, SIGINT) targets a nonexistent group (ESRCH), the polling loop exits immediately, and stop() thinks the tree is dead.

Important

  1. start() catch block doesn't close _hJob — Handle leak on startup failure.

  2. Windows stop() doesn't wait for child tree — Only waits for the main PID; closes the Job Object but returns immediately. Asymmetric with Unix behavior.

  3. kill(-pid, SIGINT) return unchecked — EPERM on the initial signal gets masked as a timeout.

  4. fork/setpgid race — Parent could call kill(-pid, ...) before child reaches setpgid(0, 0). Standard fix: also call setpgid(pid, pid) in parent after fork.

Minor

  1. No tests for PROCESS_KILL_TREE — The existing ProcessRunnerTest doesn't exercise the new option.

  2. Silent degradation on Windows — If Job Object creation fails, PROCESS_KILL_TREE is silently ignored with no log/warning.

  3. hasConsole() may over-detect — Piped handles (e.g. from a service manager) would cause a false positive. Consider GetFileType(h) == FILE_TYPE_CHAR to confirm it's a console device.

See inline comments for details.

Copy link
Copy Markdown
Contributor

@matejk matejk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review — Round 2 (post 3dc180d2)

All 5 issues from the first review have been addressed ✅ — HandleGuard RAII, stale errno, kill return check, setpgid failure handling, and hasConsole GetFileType. Good work.

Remaining / new issues below. See inline comments for details.

Critical

  1. Job Object setup fails silently — KILL_TREE degrades to no-op — The entire CreateJobObjectWSetInformationJobObjectOpenProcessAssignProcessToJobObject chain can fail at any stage (most commonly AssignProcessToJobObject returns ERROR_ACCESS_DENIED when the child is already in a non-breakaway job — common in CI, containers, and service managers). Every failure path just calls _hJob.reset() and continues. No log, no warning, no exception. The user explicitly opts in to tree-kill but the feature silently becomes a no-op.

Important

  1. start() catch block: Process::kill() can throw, corrupting state — In the catch-all at lines 421-433, Process::kill(p) is called unguarded. It can throw NotFoundException/SystemException (see Process_WIN32U.cpp/Process_UNIX.cpp). If it does, _t.join() and _started.store(false) are never reached. The ProcessRunner is stuck in a state where _started==true but the thread is orphaned — subsequent start() always throws InvalidAccessException.

  2. Windows stop() sends NamedEvent unconditionally when KILL_TREE is active — asymmetric with Unix — On Unix, stop() branches on PROCESS_KILL_TREE to signal the group vs individual process. On Windows, the NamedEvent graceful signal is always sent (regardless of KILL_TREE), and tree termination relies on closing _hJob later. This means: (a) the NamedEvent triggers graceful shutdown in the leader, which may start cleanup — then _hJob close forcefully terminates the tree mid-cleanup; (b) the NamedEvent only works for Poco ServerApplication-based processes — non-Poco children get no graceful signal at all.

  3. _exit(73) magic number is opaque to the parent — If setpgid(0,0) fails in the child, exit code 73 is returned. The parent sees return=73 in the error string with no indication this was an infrastructure failure vs the application itself returning 73. The actual errno from setpgid is lost.

  4. Documentation/code mismatch in ProcessOptions.h — The doc says the termination request is sent to the main process, but the code sends kill(-pid, SIGINT) to the entire process group. The doc should match the actual group-wide SIGINT behavior.

Other

  1. No tests for PROCESS_KILL_TREE — ProcessRunnerTest has zero test cases exercising the new option. No coverage for Job Object creation, process group setup, or tree termination on stop.

  2. HandleGuard design concerns — Public handle member bypasses the RAII invariant (code already accesses _hJob.handle directly). reset() has no self-assignment guard (reset(handle) would close then lose the resource). Inconsistent with ProcessHandleImpl in the same codebase which uses a private _hProcess + process() accessor.

@aleks-f aleks-f requested a review from Copilot February 13, 2026 08:39
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}
else
{
Poco::format(jobErr, "CreateJobObjectW: %s", Error::getMessage(Error::last()));
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the previous comment - verify the correct signature for Poco::format. The typical pattern is either format(output, formatString, args...) or output = format(formatString, args...).

Copilot uses AI. Check for mistakes.
{
if (!AssignProcessToJobObject(_hJob.handle(), hProc))
{
Poco::format(jobErr, "AssignProcessToJobObject: %s", Error::getMessage(Error::last()));
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same Poco::format signature issue as above. Ensure consistent usage throughout this code block.

Copilot uses AI. Check for mistakes.
@matejk matejk merged commit e377bda into main Feb 13, 2026
100 checks passed
@matejk matejk deleted the 5199-process-termination branch February 13, 2026 13:47
matejk pushed a commit that referenced this pull request Feb 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants