fix(task): prevent deadlock when MISE_JOBS=1 with sub-task references #8976

Merged
jdx merged 4 commits into main from fix/task-hang-jobs-1 on Apr 9, 2026

@jdx jdx commented Apr 9, 2026

Summary

  • Fixes deadlock when MISE_JOBS=1 and a task's run array contains both sub-task references ({ task = "foo" }) and scripts
  • The parent task acquires the single semaphore permit, then waits for the sub-task which also needs that permit — classic deadlock
  • Fix: temporarily release the parent's permit before inject_and_wait, re-acquire afterward
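
The release-and-re-acquire pattern described above can be sketched roughly as follows. This is not the code from the PR: it swaps the scheduler's async semaphore for a small hand-rolled `Mutex`/`Condvar` semaphore and uses a thread in place of `inject_and_wait`, so that the example depends only on the standard library. The names `Semaphore`, `Permit`, and `run_parent` are hypothetical.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical counting semaphore standing in for the scheduler's semaphore.
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

/// RAII permit: dropping it hands the permit back to the semaphore.
struct Permit {
    sem: Arc<Semaphore>,
}

impl Semaphore {
    fn new(permits: usize) -> Arc<Semaphore> {
        Arc::new(Semaphore { permits: Mutex::new(permits), available: Condvar::new() })
    }

    /// Blocks until a permit is free, then takes it.
    fn acquire(sem: &Arc<Semaphore>) -> Permit {
        let mut free = sem.permits.lock().unwrap();
        while *free == 0 {
            free = sem.available.wait(free).unwrap();
        }
        *free -= 1;
        Permit { sem: Arc::clone(sem) }
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        *self.sem.permits.lock().unwrap() += 1;
        self.sem.available.notify_one();
    }
}

/// Runs a parent whose run entries are [sub-task "foo", script "bar"]
/// under a job limit of `jobs`, returning the order in which they ran.
fn run_parent(jobs: usize) -> Vec<String> {
    let sem = Semaphore::new(jobs);
    let log = Arc::new(Mutex::new(Vec::new()));

    // The scheduler hands the parent a permit before it starts.
    let mut permit = Some(Semaphore::acquire(&sem));

    // Release the parent's permit before waiting on the sub-task.
    // Without this step, jobs == 1 would block forever at the join below.
    let had_permit = permit.is_some();
    drop(permit.take());

    let sub = {
        let (sem, log) = (Arc::clone(&sem), Arc::clone(&log));
        thread::spawn(move || {
            let _permit = Semaphore::acquire(&sem); // the sub-task needs its own permit
            log.lock().unwrap().push("foo".to_string());
        })
    };
    sub.join().unwrap(); // stand-in for waiting on the injected sub-task

    // Re-acquire before running the remaining script entries.
    if had_permit {
        permit = Some(Semaphore::acquire(&sem));
    }
    log.lock().unwrap().push("bar".to_string());
    drop(permit);

    let order = log.lock().unwrap().clone();
    order
}
```

With this sketch, `run_parent(1)` completes and reports the entries in the order `foo`, `bar`. Removing the `drop(permit.take())` line reproduces the hang, because the sub-task thread waits for a permit that the parent never gives up.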

Fixes #8967

Test plan

  • Added e2e test test_run_subtask_jobs1 that reproduces the exact scenario from the discussion
  • Verified the test hangs without the fix and passes with it
  • Verified normal (non-MISE_JOBS=1) execution still works

🤖 Generated with Claude Code


Note

Medium Risk
Changes task scheduling/semaphore permit handling during nested sub-task execution; mistakes could impact task concurrency limits or cause hangs under certain job configurations.

Overview
Prevents a deadlock when running tasks with MISE_JOBS=1 whose run array mixes sub-task entries (e.g. { task = "foo" }) and script lines by releasing the parent task’s semaphore permit before inject_and_wait and re-acquiring it afterward.

This threads the scheduler Semaphore and the current OwnedSemaphorePermit through Run::run_task_sched into TaskExecutor::exec_task_run_entries so nested sub-task execution can yield the permit while waiting. Adds an e2e regression test test_run_subtask_jobs1 to reproduce the hang and verify correct output ordering.

Reviewed by Cursor Bugbot for commit cb780c0. Bugbot is set up for automated code reviews on this repo.

When a task contains both sub-task references ({ task = "foo" }) and
scripts in its run array, it acquires a semaphore permit. With MISE_JOBS=1,
the sub-task also needs the single available permit, causing a deadlock.

Fix by temporarily releasing the parent's permit before inject_and_wait
and re-acquiring it afterward for subsequent script entries.

Fixes #8967

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

greptile-apps Bot commented Apr 9, 2026

Greptile Summary

This PR fixes a classic deadlock that occurs when MISE_JOBS=1 and a task's run array mixes sub-task references ({ task = "foo" }) with scripts. The parent task holds the sole semaphore permit, then blocks inside inject_and_wait for the sub-task, which can never acquire a permit of its own.

The fix releases the parent's OwnedSemaphorePermit immediately before each inject_and_wait call and re-acquires it afterward, threading Arc<Semaphore> and &mut Option<OwnedSemaphorePermit> through run_task_sched → exec_task_run_entries. Both RunEntry::SingleTask and RunEntry::TaskGroup arms are handled symmetrically, and a focused e2e regression test is included.

Confidence Score: 5/5

Safe to merge — the fix is minimal, targeted, logically sound, and covered by a regression e2e test.

Both run-entry arms (SingleTask and TaskGroup) are handled symmetrically. The had_permit flag correctly preserves intent (non-permit orchestrator tasks are unaffected). Releasing the permit while blocked on inject_and_wait also marginally improves throughput in multi-job scenarios. No existing behavior is changed outside the deadlock path, and the regression test verifies the exact scenario from the bug report.
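
The role of the had_permit guard can be illustrated with a small single-threaded sketch. It is an assumption-laden stand-in rather than the PR's code: `Pool`, `Permit`, `yield_permit_while`, and `subtask_can_run` are hypothetical names, and a non-blocking `try_acquire` on an atomic counter replaces the real async semaphore so the deadlock condition can be observed without actually hanging.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical non-blocking permit pool (stand-in for the scheduler semaphore).
struct Pool {
    free: AtomicUsize,
}

/// RAII permit: dropping it returns the permit to the pool.
struct Permit {
    pool: Arc<Pool>,
}

impl Pool {
    fn new(permits: usize) -> Arc<Pool> {
        Arc::new(Pool { free: AtomicUsize::new(permits) })
    }

    /// Takes a permit if one is free; returns None instead of blocking.
    fn try_acquire(pool: &Arc<Pool>) -> Option<Permit> {
        pool.free
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .ok()
            .map(|_| Permit { pool: Arc::clone(pool) })
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.pool.free.fetch_add(1, Ordering::AcqRel);
    }
}

/// Runs `wait` with the caller's permit released, and re-acquires afterward
/// only if the caller held a permit to begin with (the had_permit guard).
fn yield_permit_while<T>(
    pool: &Arc<Pool>,
    permit: &mut Option<Permit>,
    wait: impl FnOnce() -> T,
) -> T {
    let had_permit = permit.is_some();
    drop(permit.take());
    let out = wait();
    if had_permit {
        // Real code would wait for a permit here; the sketch only tries once.
        *permit = Pool::try_acquire(pool);
    }
    out
}

/// Reports whether a sub-task can obtain a permit while the parent waits,
/// with a single-permit pool (the MISE_JOBS=1 case).
fn subtask_can_run(parent_yields: bool) -> bool {
    let pool = Pool::new(1);
    let mut parent = Pool::try_acquire(&pool);
    let sub_run = |p: &Arc<Pool>| Pool::try_acquire(p).is_some();
    let ok = if parent_yields {
        yield_permit_while(&pool, &mut parent, || sub_run(&pool))
    } else {
        sub_run(&pool) // parent still holds the only permit
    };
    ok
}
```

In this sketch `subtask_can_run(false)` is false, which corresponds to the deadlock, and `subtask_can_run(true)` is true. A caller that passes `None` as its permit, like an orchestrator task that never held one, comes back still holding `None`, so it does not start consuming a job slot it did not have before.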

No files require special attention.

Vulnerabilities

No security concerns identified. The change only affects internal semaphore permit management in the task execution concurrency layer and introduces no new attack surface.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/task/task_executor.rs | Core fix: releases semaphore permit before inject_and_wait and re-acquires after; both SingleTask and TaskGroup arms handled symmetrically with correct had_permit guard. |
| src/cli/run.rs | Threads Arc<Semaphore> and &mut Option<OwnedSemaphorePermit> from spawn_sched_job through run_task_sched to exec_task_run_entries; no functional logic changed in the call site. |
| e2e/cli/test_run_subtask_jobs1 | Regression test that reproduces the exact deadlock scenario (MISE_JOBS=1, mixed sub-task + script run array) and asserts ordered output "foo\nbar". |

Sequence Diagram

sequenceDiagram
    participant Sched as Scheduler
    participant Parent as parent task
    participant Sem as Semaphore (JOBS=1)
    participant Sub as sub-task (foo)

    Sched->>Sem: acquire_owned() → permit
    Sem-->>Sched: permit ✓
    Sched->>Parent: run_task_sched(permit)
    Note over Parent: exec_task_run_entries
    Note over Parent: RunEntry::SingleTask detected
    Parent->>Sem: drop permit (had_permit=true)
    Parent->>Sched: inject_and_wait → sched_tx.send(sub-task)
    Sched->>Sem: acquire_owned() for sub-task
    Sem-->>Sched: permit ✓  (now available)
    Sched->>Sub: run sub-task
    Sub-->>Sched: done
    Sched->>Sem: drop sub-task permit
    Parent->>Sem: re-acquire permit
    Sem-->>Parent: permit ✓
    Note over Parent: RunEntry::Script → exec_script("echo bar")
    Parent-->>Sched: done

Reviews (4): Last reviewed commit: "fix(task): allow clippy::too_many_argume..."


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request addresses a deadlock issue that occurs when MISE_JOBS=1 by ensuring that semaphore permits are released before waiting on sub-tasks and re-acquired afterward. It also adds a regression test to verify this behavior. I have no feedback to provide.

@jdx jdx enabled auto-merge (squash) April 9, 2026 15:01

github-actions Bot commented Apr 9, 2026

Hyperfine Performance

mise x -- echo

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.7 x -- echo | 24.1 ± 0.8 | 22.4 | 27.5 | 1.00 |
| mise x -- echo | 25.1 ± 0.8 | 22.9 | 27.4 | 1.04 ± 0.05 |

mise env

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.7 env | 23.7 ± 0.7 | 22.2 | 28.9 | 1.00 |
| mise env | 24.0 ± 0.7 | 22.0 | 29.8 | 1.01 ± 0.04 |

mise hook-env

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.7 hook-env | 24.6 ± 0.7 | 22.5 | 26.7 | 1.00 |
| mise hook-env | 24.7 ± 0.7 | 23.1 | 28.3 | 1.00 ± 0.04 |

mise ls

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.7 ls | 21.8 ± 0.9 | 20.0 | 25.3 | 1.00 |
| mise ls | 22.1 ± 0.8 | 20.2 | 26.3 | 1.01 ± 0.05 |

xtasks/test/perf

| Command | mise-2026.4.7 | mise | Variance |
| --- | --- | --- | --- |
| install (cached) | 154ms | 155ms | +0% |
| ls (cached) | 81ms | 83ms | -2% |
| bin-paths (cached) | 85ms | 85ms | +0% |
| task-ls (cached) | 818ms | 814ms | +0% |

jdx and others added 3 commits April 9, 2026 15:26
@jdx jdx merged commit edd79aa into main Apr 9, 2026
36 checks passed
@jdx jdx deleted the fix/task-hang-jobs-1 branch April 9, 2026 16:07
mise-en-dev added a commit that referenced this pull request Apr 10, 2026
### 🚀 Features

- **(config)** add lockfile_platforms setting to restrict lockfile
platforms by @cameronbrill in
[#8966](#8966)
- **(sandbox)** support wildcard patterns in allow_env by @jdx in
[#8974](#8974)
- bump usage-lib v2 → v3 to render examples in task --help by @baby-joel
in [#8890](#8890)

### 🐛 Bug Fixes

- **(activate)** handle empty __MISE_FLAGS array with set -u on bash 3.2
by @jdx in [#8988](#8988)
- **(env)** add trace logging for module hook PATH diagnostics by @jdx
in [#8981](#8981)
- **(go)** Query module proxy directly for version resolution by @c22 in
[#8968](#8968)
- **(install)** render tera templates in tool postinstall hooks by @jdx
in [#8978](#8978)
- **(install)** add missing env vars to tool postinstall hooks by @jdx
in [#8977](#8977)
- **(task)** prevent hang when skipped task has dependents by @jdx in
[#8937](#8937)
- **(task)** invalidate dependent task sources when dependency runs by
@jdx in [#8975](#8975)
- **(task)** prevent deadlock when MISE_JOBS=1 with sub-task references
by @jdx in [#8976](#8976)
- **(task)** fetch remote task files before parsing usage specs by @jdx
in [#8979](#8979)
- **(task)** prevent panic when running parallel sub-tasks with
replacing output by @jdx in
[#8986](#8986)
- **(upgrade)** update lockfile and config when upgrading to specific
version by @jdx in [#8983](#8983)

### 📚 Documentation

- **(node)** remove "recommended for teams" from pin example by @jdx in
[b334363](b334363)

### 📦️ Dependency Updates

- update ghcr.io/jdx/mise:alpine docker digest to 17a29f2 by
@renovate[bot] in [#8995](#8995)
- update docker/dockerfile:1 docker digest to 2780b5c by @renovate[bot]
in [#8994](#8994)

### New Contributors

- @baby-joel made their first contribution in
[#8890](#8890)
- @cameronbrill made their first contribution in
[#8966](#8966)
- @c22 made their first contribution in
[#8968](#8968)