Skip to content

fix(task): revert process group changes that cause hangs with nested mise tasks#8347

Merged
jdx merged 2 commits intomainfrom
fix/revert-process-groups
Feb 25, 2026
Merged

fix(task): revert process group changes that cause hangs with nested mise tasks#8347
jdx merged 2 commits intomainfrom
fix/revert-process-groups

Conversation

@jdx
Copy link
Owner

@jdx jdx commented Feb 25, 2026

Summary

Root Cause

The process group change created a hierarchy where nested mise run calls each placed their children in separate process groups:

mise run test  (pgid=terminal)
  └── sh -c "npm test"  (pgid=A, set by outer mise's setpgid)
      └── playwright
          └── mise run serve  (pgid=C, set by Playwright's detached:true)
              └── sh -c "npm run serve"  (pgid=D, set by inner mise's setpgid)
                  └── node main.js  (inherits pgid=D)

When Playwright kills pgid=C (kill(-pid, SIGKILL)), only processes in that group die. The actual server in pgid=D survives because SIGKILL cannot be caught/forwarded, and the surviving process holds pipe file descriptors open, preventing EOF and causing the parent mise to hang indefinitely.

Fixes #8345

Test plan

  • cargo build compiles cleanly
  • mise run lint passes
  • mise run test:unit — all 491 tests pass
  • mise run test:e2e test_task_timeout — timeout tests pass

🤖 Generated with Claude Code


Note

Medium Risk
Moderate risk because it changes how signals/timeouts propagate to subprocess trees on Unix, which can affect cleanup and Ctrl-C/termination semantics for running tasks.

Overview
Stops creating per-command process groups on Unix by removing the pre_exec setpgid hook, restoring the prior behavior where child commands stay in the caller’s process group.

Replaces all killpg usage with direct kill calls: timeout termination (SIGTERM then SIGKILL), global shutdown via CmdLineRunner::kill_all, and signal forwarding during execute now target only the child PID (and it no longer forwards SIGINT).

Written by Cursor Bugbot for commit e6ed47d. This will update automatically on new commits. Configure here.

…mise tasks

Reverts the setpgid/killpg changes from #8279 and #8301. Placing each
child process in its own process group via setpgid(0,0) breaks external
tools (e.g. Playwright) that rely on process-group-based kills to tear
down process trees. When Playwright SIGKILLs the webServer process
group, grandchild processes in a separate group (created by inner mise's
setpgid) survive, holding pipes open and causing the parent to hang
indefinitely.

Fixes #8345

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @jdx, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where recent changes to process group management were causing hangs in nested mise tasks. By reverting these changes, the PR restores the previous behavior and resolves compatibility issues with tools like Playwright that rely on process group signaling.

Highlights

Changelog
  • src/cmd.rs
    • Reverted process group changes to fix hangs with nested mise tasks.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request reverts process group management changes to fix issues with nested tasks. The changes primarily involve removing setpgid calls and replacing killpg with kill to target specific process IDs. The modifications appear correct and consistent with the goal of the pull request. I've identified one potential issue in the signal forwarding logic where an error from a kill call could be incorrectly propagated, and I've provided a suggestion to improve its robustness.

src/cmd.rs Outdated
debug!("Received signal {sig}, forwarding to {id}");
let pid = nix::unistd::Pid::from_raw(id as i32);
let sig = nix::sys::signal::Signal::try_from(sig).unwrap();
nix::sys::signal::kill(pid, sig)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using the ? operator here will cause execute() to return an error if kill fails. This can happen for benign reasons, such as the child process having already exited. This would incorrectly report a failure for the command execution.

To make this more robust and align with the behavior in TimeoutGuard, it's better to ignore the result of the kill call. The process we're trying to signal might already be gone, and that shouldn't be considered an error in this context.

Suggested change
nix::sys::signal::kill(pid, sig)?;
let _ = nix::sys::signal::kill(pid, sig);

@greptile-apps
Copy link

greptile-apps bot commented Feb 25, 2026

Greptile Summary

Reverted setpgid/killpg process group changes from PRs #8279 and #8301 to fix hangs with external tools like Playwright. The original changes placed each child process in its own process group, which broke tools that rely on process-group-based kills to tear down process trees. When Playwright killed its webServer process group with SIGKILL, grandchild processes in separate groups (created by nested mise calls) survived, held pipes open, and caused indefinite hangs.

Key changes:

  • Removed setpgid(0,0) call in pre_exec that created separate process groups
  • Changed back from killpg (process group kill) to kill (single process kill) in timeout handling and signal forwarding
  • Removed TTY detection logic that was conditionally applying process groups
  • Restored SIGINT filtering (skip forwarding SIGINT to avoid duplicate signals)

Trade-off: This revert sacrifices complete grandchild process cleanup (the original goal of #8279) to maintain compatibility with external tools that manage their own process groups. The issue with nested mise tasks and Playwright is prioritized over the grandchild orphaning problem.

Confidence Score: 4/5

  • This PR is safe to merge with minimal risk - it's a clean revert to previous working behavior
  • Score reflects that this is a well-understood revert with clear justification. The changes are internally consistent and return to previously stable code. The only concern is the known trade-off: grandchild processes may not be cleaned up as thoroughly (the original issue from before PR fix(task): use process groups to kill child process trees on Unix #8279), but this is an accepted compromise for Playwright compatibility
  • No files require special attention - the revert is clean and complete

Important Files Changed

Filename Overview
src/cmd.rs Reverts process group changes to fix Playwright/external tool hangs; returns to single-process kill approach

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["mise run test<br/>(terminal pgid)"] --> B["sh -c 'npm test'<br/>(pgid A - WITH setpgid)"]
    B --> C["playwright"]
    C --> D["mise run serve<br/>(pgid C - detached)"]
    D --> E["sh -c 'npm run serve'<br/>(pgid D - WITH setpgid)"]
    E --> F["node main.js<br/>(inherits pgid D)"]
    
    C -.->|"kill(-pid, SIGKILL)<br/>targets pgid C only"| G["PROBLEM: pgid D survives<br/>pipes stay open<br/>parent hangs"]
    
    F -.->|"survives because<br/>in different pgid"| G
    
    style G fill:#ff9999
    style F fill:#ffcccc
    style E fill:#ffcccc
Loading

Last reviewed commit: ca35256

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

@github-actions
Copy link

github-actions bot commented Feb 25, 2026

Hyperfine Performance

mise x -- echo

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.2.20 x -- echo 23.9 ± 0.4 23.4 28.0 1.02 ± 0.03
mise x -- echo 23.3 ± 0.5 22.7 30.0 1.00

mise env

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.2.20 env 22.8 ± 0.6 22.1 29.6 1.00 ± 0.03
mise env 22.8 ± 0.3 22.1 24.8 1.00

mise hook-env

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.2.20 hook-env 23.2 ± 0.4 22.7 27.7 1.00
mise hook-env 23.5 ± 0.5 22.8 29.2 1.01 ± 0.03

mise ls

Command Mean [ms] Min [ms] Max [ms] Relative
mise-2026.2.20 ls 20.9 ± 0.3 20.2 22.6 1.00
mise ls 21.2 ± 0.5 20.5 24.3 1.01 ± 0.03

xtasks/test/perf

Command mise-2026.2.20 mise Variance
install (cached) 149ms 150ms +0%
ls (cached) 82ms 83ms -1%
bin-paths (cached) 88ms 87ms +1%
task-ls (cached) 825ms 823ms +0%

…esses

The child may have already exited (killed by TimeoutGuard or naturally)
when a signal arrives, causing kill() to return ESRCH. Use `let _ =`
instead of `?` to avoid spurious task failures from this race.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jdx jdx merged commit 26ac453 into main Feb 25, 2026
48 of 51 checks passed
@jdx jdx deleted the fix/revert-process-groups branch February 25, 2026 20:00
mise-en-dev added a commit that referenced this pull request Feb 26, 2026
### 🐛 Bug Fixes

- **(exec)** respect PATH order for virtualenv resolution in mise x by
@jdx in [#8342](#8342)
- **(task)** revert process group changes that cause hangs with nested
mise tasks by @jdx in [#8347](#8347)
- **(task)** resolve vars from subdirectory configs for monorepo tasks
by @jdx in [#8343](#8343)
- **(task)** resolve dependencies before prepare to fix monorepo glob
deps by @jdx in [#8353](#8353)
- python noarch with Conda backend by @wolfv in
[#8349](#8349)

### New Contributors

- @wolfv made their first contribution in
[#8349](#8349)

## 📦 Aqua Registry Updates

#### New Packages (3)

- [`alexhallam/tv`](https://github.com/alexhallam/tv)
- [`arcanist-sh/hx`](https://github.com/arcanist-sh/hx)
- [`dathere/qsv`](https://github.com/dathere/qsv)

#### Updated Packages (3)

- [`astral-sh/ruff`](https://github.com/astral-sh/ruff)
- [`caarlos0/fork-cleaner`](https://github.com/caarlos0/fork-cleaner)
- [`rhysd/actionlint`](https://github.com/rhysd/actionlint)
risu729 pushed a commit to risu729/mise that referenced this pull request Feb 27, 2026
…mise tasks (jdx#8347)

## Summary
- Reverts the `setpgid`/`killpg` changes from jdx#8279 and jdx#8301
- Placing each child process in its own process group via `setpgid(0,0)`
breaks external tools (e.g. Playwright) that rely on process-group-based
kills to tear down process trees
- When Playwright `SIGKILL`s the webServer process group, grandchild
processes in a separate group (created by inner mise's `setpgid`)
survive, holding pipes open and causing the parent to hang indefinitely

## Root Cause

The process group change created a hierarchy where nested `mise run`
calls each placed their children in separate process groups:

```
mise run test  (pgid=terminal)
  └── sh -c "npm test"  (pgid=A, set by outer mise's setpgid)
      └── playwright
          └── mise run serve  (pgid=C, set by Playwright's detached:true)
              └── sh -c "npm run serve"  (pgid=D, set by inner mise's setpgid)
                  └── node main.js  (inherits pgid=D)
```

When Playwright kills pgid=C (`kill(-pid, SIGKILL)`), only processes in
that group die. The actual server in pgid=D survives because `SIGKILL`
cannot be caught/forwarded, and the surviving process holds pipe file
descriptors open, preventing EOF and causing the parent mise to hang
indefinitely.

Fixes jdx#8345

## Test plan
- [x] `cargo build` compiles cleanly
- [x] `mise run lint` passes
- [x] `mise run test:unit` — all 491 tests pass
- [x] `mise run test:e2e test_task_timeout` — timeout tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Moderate risk because it changes how signals/timeouts propagate to
subprocess trees on Unix, which can affect cleanup and
Ctrl-C/termination semantics for running tasks.
> 
> **Overview**
> Stops creating per-command process groups on Unix by removing the
`pre_exec` `setpgid` hook, restoring the prior behavior where child
commands stay in the caller’s process group.
> 
> Replaces all `killpg` usage with direct `kill` calls: timeout
termination (`SIGTERM` then `SIGKILL`), global shutdown via
`CmdLineRunner::kill_all`, and signal forwarding during `execute` now
target only the child PID (and it no longer forwards `SIGINT`).
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e6ed47d. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
risu729 pushed a commit to risu729/mise that referenced this pull request Feb 27, 2026
### 🐛 Bug Fixes

- **(exec)** respect PATH order for virtualenv resolution in mise x by
@jdx in [jdx#8342](jdx#8342)
- **(task)** revert process group changes that cause hangs with nested
mise tasks by @jdx in [jdx#8347](jdx#8347)
- **(task)** resolve vars from subdirectory configs for monorepo tasks
by @jdx in [jdx#8343](jdx#8343)
- **(task)** resolve dependencies before prepare to fix monorepo glob
deps by @jdx in [jdx#8353](jdx#8353)
- python noarch with Conda backend by @wolfv in
[jdx#8349](jdx#8349)

### New Contributors

- @wolfv made their first contribution in
[jdx#8349](jdx#8349)

## 📦 Aqua Registry Updates

#### New Packages (3)

- [`alexhallam/tv`](https://github.com/alexhallam/tv)
- [`arcanist-sh/hx`](https://github.com/arcanist-sh/hx)
- [`dathere/qsv`](https://github.com/dathere/qsv)

#### Updated Packages (3)

- [`astral-sh/ruff`](https://github.com/astral-sh/ruff)
- [`caarlos0/fork-cleaner`](https://github.com/caarlos0/fork-cleaner)
- [`rhysd/actionlint`](https://github.com/rhysd/actionlint)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant