fix(task): revert process group changes that cause hangs with nested mise tasks#8347
fix(task): revert process group changes that cause hangs with nested mise tasks#8347
Conversation
…mise tasks Reverts the setpgid/killpg changes from #8279 and #8301. Placing each child process in its own process group via setpgid(0,0) breaks external tools (e.g. Playwright) that rely on process-group-based kills to tear down process trees. When Playwright SIGKILLs the webServer process group, grandchild processes in a separate group (created by inner mise's setpgid) survive, holding pipes open and causing the parent to hang indefinitely. Fixes #8345 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello @jdx, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue where recent changes to process group management were causing hangs in nested Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request reverts process group management changes to fix issues with nested tasks. The changes primarily involve removing setpgid calls and replacing killpg with kill to target specific process IDs. The modifications appear correct and consistent with the goal of the pull request. I've identified one potential issue in the signal forwarding logic where an error from a kill call could be incorrectly propagated, and I've provided a suggestion to improve its robustness.
src/cmd.rs
Outdated
| debug!("Received signal {sig}, forwarding to {id}"); | ||
| let pid = nix::unistd::Pid::from_raw(id as i32); | ||
| let sig = nix::sys::signal::Signal::try_from(sig).unwrap(); | ||
| nix::sys::signal::kill(pid, sig)?; |
There was a problem hiding this comment.
Using the ? operator here will cause execute() to return an error if kill fails. This can happen for benign reasons, such as the child process having already exited. This would incorrectly report a failure for the command execution.
To make this more robust and align with the behavior in TimeoutGuard, it's better to ignore the result of the kill call. The process we're trying to signal might already be gone, and that shouldn't be considered an error in this context.
| nix::sys::signal::kill(pid, sig)?; | |
| let _ = nix::sys::signal::kill(pid, sig); |
Greptile SummaryReverted Key changes:
Trade-off: This revert sacrifices complete grandchild process cleanup (the original goal of #8279) to maintain compatibility with external tools that manage their own process groups. The issue with nested mise tasks and Playwright is prioritized over the grandchild orphaning problem. Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["mise run test<br/>(terminal pgid)"] --> B["sh -c 'npm test'<br/>(pgid A - WITH setpgid)"]
B --> C["playwright"]
C --> D["mise run serve<br/>(pgid C - detached)"]
D --> E["sh -c 'npm run serve'<br/>(pgid D - WITH setpgid)"]
E --> F["node main.js<br/>(inherits pgid D)"]
C -.->|"kill(-pid, SIGKILL)<br/>targets pgid C only"| G["PROBLEM: pgid D survives<br/>pipes stay open<br/>parent hangs"]
F -.->|"survives because<br/>in different pgid"| G
style G fill:#ff9999
style F fill:#ffcccc
style E fill:#ffcccc
Last reviewed commit: ca35256 |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
Hyperfine Performance
|
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
mise-2026.2.20 x -- echo |
23.9 ± 0.4 | 23.4 | 28.0 | 1.02 ± 0.03 |
mise x -- echo |
23.3 ± 0.5 | 22.7 | 30.0 | 1.00 |
mise env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
mise-2026.2.20 env |
22.8 ± 0.6 | 22.1 | 29.6 | 1.00 ± 0.03 |
mise env |
22.8 ± 0.3 | 22.1 | 24.8 | 1.00 |
mise hook-env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
mise-2026.2.20 hook-env |
23.2 ± 0.4 | 22.7 | 27.7 | 1.00 |
mise hook-env |
23.5 ± 0.5 | 22.8 | 29.2 | 1.01 ± 0.03 |
mise ls
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
mise-2026.2.20 ls |
20.9 ± 0.3 | 20.2 | 22.6 | 1.00 |
mise ls |
21.2 ± 0.5 | 20.5 | 24.3 | 1.01 ± 0.03 |
xtasks/test/perf
| Command | mise-2026.2.20 | mise | Variance |
|---|---|---|---|
| install (cached) | 149ms | 150ms | +0% |
| ls (cached) | 82ms | 83ms | -1% |
| bin-paths (cached) | 88ms | 87ms | +1% |
| task-ls (cached) | 825ms | 823ms | +0% |
…esses The child may have already exited (killed by TimeoutGuard or naturally) when a signal arrives, causing kill() to return ESRCH. Use `let _ =` instead of `?` to avoid spurious task failures from this race. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
### 🐛 Bug Fixes - **(exec)** respect PATH order for virtualenv resolution in mise x by @jdx in [#8342](#8342) - **(task)** revert process group changes that cause hangs with nested mise tasks by @jdx in [#8347](#8347) - **(task)** resolve vars from subdirectory configs for monorepo tasks by @jdx in [#8343](#8343) - **(task)** resolve dependencies before prepare to fix monorepo glob deps by @jdx in [#8353](#8353) - python noarch with Conda backend by @wolfv in [#8349](#8349) ### New Contributors - @wolfv made their first contribution in [#8349](#8349) ## 📦 Aqua Registry Updates #### New Packages (3) - [`alexhallam/tv`](https://github.com/alexhallam/tv) - [`arcanist-sh/hx`](https://github.com/arcanist-sh/hx) - [`dathere/qsv`](https://github.com/dathere/qsv) #### Updated Packages (3) - [`astral-sh/ruff`](https://github.com/astral-sh/ruff) - [`caarlos0/fork-cleaner`](https://github.com/caarlos0/fork-cleaner) - [`rhysd/actionlint`](https://github.com/rhysd/actionlint)
…mise tasks (jdx#8347) ## Summary - Reverts the `setpgid`/`killpg` changes from jdx#8279 and jdx#8301 - Placing each child process in its own process group via `setpgid(0,0)` breaks external tools (e.g. Playwright) that rely on process-group-based kills to tear down process trees - When Playwright `SIGKILL`s the webServer process group, grandchild processes in a separate group (created by inner mise's `setpgid`) survive, holding pipes open and causing the parent to hang indefinitely ## Root Cause The process group change created a hierarchy where nested `mise run` calls each placed their children in separate process groups: ``` mise run test (pgid=terminal) └── sh -c "npm test" (pgid=A, set by outer mise's setpgid) └── playwright └── mise run serve (pgid=C, set by Playwright's detached:true) └── sh -c "npm run serve" (pgid=D, set by inner mise's setpgid) └── node main.js (inherits pgid=D) ``` When Playwright kills pgid=C (`kill(-pid, SIGKILL)`), only processes in that group die. The actual server in pgid=D survives because `SIGKILL` cannot be caught/forwarded, and the surviving process holds pipe file descriptors open, preventing EOF and causing the parent mise to hang indefinitely. Fixes jdx#8345 ## Test plan - [x] `cargo build` compiles cleanly - [x] `mise run lint` passes - [x] `mise run test:unit` — all 491 tests pass - [x] `mise run test:e2e test_task_timeout` — timeout tests pass 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Moderate risk because it changes how signals/timeouts propagate to subprocess trees on Unix, which can affect cleanup and Ctrl-C/termination semantics for running tasks. > > **Overview** > Stops creating per-command process groups on Unix by removing the `pre_exec` `setpgid` hook, restoring the prior behavior where child commands stay in the caller’s process group. > > Replaces all `killpg` usage with direct `kill` calls: timeout termination (`SIGTERM` then `SIGKILL`), global shutdown via `CmdLineRunner::kill_all`, and signal forwarding during `execute` now target only the child PID (and it no longer forwards `SIGINT`). > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit e6ed47d. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
### 🐛 Bug Fixes - **(exec)** respect PATH order for virtualenv resolution in mise x by @jdx in [jdx#8342](jdx#8342) - **(task)** revert process group changes that cause hangs with nested mise tasks by @jdx in [jdx#8347](jdx#8347) - **(task)** resolve vars from subdirectory configs for monorepo tasks by @jdx in [jdx#8343](jdx#8343) - **(task)** resolve dependencies before prepare to fix monorepo glob deps by @jdx in [jdx#8353](jdx#8353) - python noarch with Conda backend by @wolfv in [jdx#8349](jdx#8349) ### New Contributors - @wolfv made their first contribution in [jdx#8349](jdx#8349) ## 📦 Aqua Registry Updates #### New Packages (3) - [`alexhallam/tv`](https://github.com/alexhallam/tv) - [`arcanist-sh/hx`](https://github.com/arcanist-sh/hx) - [`dathere/qsv`](https://github.com/dathere/qsv) #### Updated Packages (3) - [`astral-sh/ruff`](https://github.com/astral-sh/ruff) - [`caarlos0/fork-cleaner`](https://github.com/caarlos0/fork-cleaner) - [`rhysd/actionlint`](https://github.com/rhysd/actionlint)
Summary
setpgid/killpgchanges from fix(task): use process groups to kill child process trees on Unix #8279 and fix(task): skip setpgid for TTY stdin to fix interactive tasks #8301setpgid(0,0)breaks external tools (e.g. Playwright) that rely on process-group-based kills to tear down process treesSIGKILLs the webServer process group, grandchild processes in a separate group (created by inner mise'ssetpgid) survive, holding pipes open and causing the parent to hang indefinitelyRoot Cause
The process group change created a hierarchy where nested
mise runcalls each placed their children in separate process groups:When Playwright kills pgid=C (
kill(-pid, SIGKILL)), only processes in that group die. The actual server in pgid=D survives becauseSIGKILLcannot be caught/forwarded, and the surviving process holds pipe file descriptors open, preventing EOF and causing the parent mise to hang indefinitely.Fixes #8345
Test plan
cargo buildcompiles cleanlymise run lintpassesmise run test:unit— all 491 tests passmise run test:e2e test_task_timeout— timeout tests pass🤖 Generated with Claude Code
Note
Medium Risk
Moderate risk because it changes how signals/timeouts propagate to subprocess trees on Unix, which can affect cleanup and Ctrl-C/termination semantics for running tasks.
Overview
Stops creating per-command process groups on Unix by removing the
pre_execsetpgidhook, restoring the prior behavior where child commands stay in the caller’s process group.Replaces all
killpgusage with directkillcalls: timeout termination (SIGTERMthenSIGKILL), global shutdown viaCmdLineRunner::kill_all, and signal forwarding duringexecutenow target only the child PID (and it no longer forwardsSIGINT).Written by Cursor Bugbot for commit e6ed47d. This will update automatically on new commits. Configure here.