fix(core)#20941: reap orphaned descendant processes on PTY abort#21124
fix(core)#20941: reap orphaned descendant processes on PTY abort#21124spencer426 merged 12 commits intogoogle-gemini:mainfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue where descendant processes spawned by a command within a PTY session could become orphaned and continue consuming system resources after the PTY session leader was terminated. The changes introduce robust process tree cleanup mechanisms for both Windows and Unix environments, ensuring that all child processes are properly terminated upon PTY abort, thereby preventing resource exhaustion. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request addresses an important issue of orphaned descendant processes on PTY abort by ensuring a more robust process tree cleanup on both Windows and Unix. The changes involve adding taskkill on Windows and a fallback process group kill on Unix. The accompanying tests have been updated, but I've identified a duplicated test case that should be removed. Additionally, I've suggested a refactoring in the Windows process killing logic to eliminate code duplication and improve maintainability. Overall, this is a valuable fix for resource leak prevention.
There was a problem hiding this comment.
Code Review
This pull request modifies the killProcessGroup utility to prevent orphaned descendant processes when a PTY is aborted. For Windows, it ensures taskkill /f /t is always called to terminate the process tree, even when a PTY object is present. For Unix, in the fallback scenario where an initial process group kill fails, it adds a second attempt to kill the process group after the PTY's own kill method is invoked. New tests are added to verify these changes on both platforms. The changes appear to correctly implement the described logic for preventing orphaned processes. No high or critical severity issues were found during this review.
- Add vi.unstubAllEnvs() to afterEach to prevent env var test leakage - Split SystemRoot test into two hermetic vi.stubEnv variants: one with SystemRoot set (absolute path) and one without (bare binary) - Add taskkill spawn error fallback test (ENOENT -> pty.kill()) - Add Unix orphan group SIGKILL regression test after pty fallback - Replace forbidden as-any casts with as-unknown-as-ChildProcess - On Unix pty fallback path, also call process.kill(-pid, SIGKILL) to reap orphaned descendant processes, closing the same gap as google-gemini#21124 Addresses gemini-code-assist review comments on PR google-gemini#20945.
|
@mrpmohiburrahman , @spencer426 |
|
Update: Feedback Addressed I have pushed a new commit that addresses all points raised in the latest review. I have also verified these changes locally with unit tests and linting to ensure alignment with the project’s architectural standards. Key Changes: Architectural Alignment: Switched from direct child_process.spawn usage to the standardized spawnAsync utility in packages/core. This ensures consistent shell operation handling and better error reporting. Test Robustness: Verification: Thanks again for the detailed feedback! CC: @spencer426 @mrpmohiburrahman |
|
Looks like there are some merge conflicts please resolve. |
Previously, killProcessGroup would only call pty.kill() when a PTY was provided, which only terminates the PTY session leader (e.g., the bash wrapper). Nested background child processes spawned by the command would survive as orphans, silently consuming CPU/memory and eventually causing OS-level resource exhaustion. On Windows: After pty.kill(), now also invokes taskkill /pid /f /t to terminate the entire process tree including descendants. On Unix: When the initial process group kill fails and falls back to pty.kill(), now also attempts process.kill(-pid, SIGKILL) to reap any orphaned descendant processes. Fixes google-gemini#20941
e033259 to
31335f3
Compare
Final Update: Conflicts Resolved & Code VerifiedI have resolved the merge conflict in shellExecutionService.ts. I have successfully merged the new destroy() cleanup logic from main with the required await killProcessGroup logic to ensure robust process termination on macOS/Unix. Summary of Re-Verification:
The PR is now fully up-to-date with main and should be ready for final approval. |
…am merge The upstream merge introduced a new spawn options object to spawnAsync which caused the taskkill assertion to fail because it now passes options as a third argument instead of none. Update the assertion to match the new call signature. Also add missing await to ShellExecutionService.kill() call in the PTY destroy test to ensure proper async execution.
|
Hi @spencer426 @mrpmohiburrahman , Cleaned up all the issue. |
|
@manavmax Looks like there's still a linting issue. |
Fix indentation in union type definitions and empty callback formatting to pass CI prettier checks.
Resolved merge conflict in AppContainer.tsx by adopting the upstream Promise.all pattern for killing background shells.
|
Hi @spencer426, all issues have been addressed: Merge conflict resolved: Previous fixes (from earlier commits): Verification: Ready for review. Could you please rerun the CI checks? Thank you for your patience! |
|
Thank you @spencer426 for guiding me throughout all of this. |
…TY abort (google-gemini#21124) Co-authored-by: Spencer <spencertang@google.com>
…TY abort (google-gemini#21124) Co-authored-by: Spencer <spencertang@google.com>
…TY abort (google-gemini#21124) Co-authored-by: Spencer <spencertang@google.com>
…TY abort (google-gemini#21124) Co-authored-by: Spencer <spencertang@google.com>
…TY abort (google-gemini#21124) Co-authored-by: Spencer <spencertang@google.com>
Summary
Previously, killProcessGroup would only call pty.kill() when a PTY was provided, which only terminates the PTY session leader (e.g., the bash wrapper). Nested background child processes spawned by the command would survive as orphans, silently consuming CPU/memory and eventually causing OS-level resource exhaustion. This fix ensures complete process tree cleanup on PTY abort across both Windows and Unix.
Details
The fix adds descendant process reaping after pty.kill() on both platforms:
Two new regression tests were added, and one existing test was updated to validate the new behaviour.
Related Issues
Fixes #20941
How to Validate
npm test -w @google/gemini-cli-core -- src/utils/process-utils.test.tsAll 9 tests pass, including 2 new regression tests:
### Pre-Merge Checklist
- [ ✔] MacOS
- [ ✔] npm run
- [ ] npx
- [ ] Docker
- [ ] Podman
- [ ] Seatbelt
- [ ] Windows
- [ ] npm run
- [ ] npx
- [ ] Docker
- [ ] Linux
- [ ] npm run
- [ ] npx
- [ ] Docker