fix(tasks): deduplicate shared deps across task delegation#8497

Merged
jdx merged 2 commits into jdx:main from vadimpiven:fix/task-delegation-dedup on Mar 7, 2026
Conversation

@vadimpiven (Contributor) commented Mar 7, 2026

When a task delegates via run = [{ task }], the sub-graph created in inject_and_wait had its own sent HashSet, causing shared dependencies to run multiple times. Pass the parent Deps to sub-graphs so they can prune tasks already started or completed in the parent graph.

Here is the plan I used to make Claude implement the fix:

# Task delegation (`run = [{ task }]`) bypasses dedup

mise 2026.3.4 macos-arm64

Reproduction

# mise.toml
[tasks.setup]
run = "echo '>>> SETUP RUNNING <<<'"

[tasks.lint]
hide = true
depends = ["setup"]
run = "echo 'lint done'"

[tasks."check:security"]
hide = true
depends = ["setup"]
run = "echo 'security check done'"

[tasks."fix:security"]
hide = true
run = [{ task = "check:security" }]

[tasks.all]
depends = ["lint", "fix:security"]
$ mise run -f all
[setup] >>> SETUP RUNNING <<<
[setup] Finished in 5.1ms
[lint] lint done
[setup] >>> SETUP RUNNING <<<    # duplicate
[setup] Finished in 4.5ms
[check:security] security check done

Expected: setup runs once.

Root cause

inject_and_wait (task_executor.rs) builds a fresh Deps graph for
each run = [{ task }] entry. Each sub-graph has its own sent
HashSet, so it knows nothing about tasks the parent already ran.

Fix

Snapshot the parent graph's confirmed-complete (removed) task keys
before spawning the sub-graph. The sub-graph constructor prunes
matching nodes at construction time. Only completed tasks are pruned
-- in-flight tasks stay in the sub-graph to preserve dependency
ordering.

1. src/task/deps.rs

Two additions to impl Deps:

new_pruned -- builds a sub-graph with completed tasks removed.
Does not pre-populate removed, keeping cycle detection intact
(sent.len() == removed.len() starts balanced at 0):

pub async fn new_pruned(
    config: &Arc<Config>,
    tasks: Vec<Task>,
    completed: &HashSet<TaskKey>,
) -> eyre::Result<Self> {
    let mut deps = Self::new(config, tasks).await?;
    let mut to_remove = vec![];
    for idx in deps.graph.node_indices() {
        let key = task_key(&deps.graph[idx]);
        if completed.contains(&key) {
            to_remove.push(idx);
        }
    }
    // Reverse order: petgraph swap-removes, so high indices first
    to_remove.sort_unstable_by(|a, b| b.cmp(a));
    for idx in to_remove {
        deps.graph.remove_node(idx);
    }
    Ok(deps)
}
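The descending sort before removal matters because petgraph's `Graph::remove_node` moves the graph's last node into the vacated slot, so any pending higher index would point at the wrong node after an earlier removal. A std-only analogue using `Vec::swap_remove` (which has the same swap-into-the-hole behavior) illustrates the rule; the function and task names here are illustrative, not part of the patch:

```rust
// Illustrative sketch: `Vec::swap_remove`, like petgraph's
// `Graph::remove_node`, fills the vacated slot with the last element,
// so batched index removals must proceed from highest index to lowest.
fn remove_descending(items: &mut Vec<&str>, mut to_remove: Vec<usize>) {
    // Highest indices first, mirroring the sort in `new_pruned`.
    to_remove.sort_unstable_by(|a, b| b.cmp(a));
    for idx in to_remove {
        items.swap_remove(idx);
    }
}

fn main() {
    let mut tasks = vec!["setup", "lint", "check:security", "all"];
    // Remove indices 0 and 2. In ascending order, removing index 0
    // first would swap "all" into slot 0 and leave index 2 pointing
    // at a different node than intended.
    remove_descending(&mut tasks, vec![0, 2]);
    assert!(!tasks.contains(&"setup"));
    assert!(!tasks.contains(&"check:security"));
    assert_eq!(tasks.len(), 2);
}
```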

handled_task_keys -- returns only removed (confirmed-complete).
Excludes in-flight (sent) tasks so sub-graphs preserve dependency
ordering for tasks still running in the parent:

pub fn handled_task_keys(&self) -> HashSet<TaskKey> {
    self.removed.clone()
}
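The removed-versus-sent distinction can be sketched as a classification rule. This is a hypothetical std-only illustration (the enum, function, and string keys are stand-ins for the real `TaskKey` machinery), showing why only confirmed-complete tasks are safe to prune:

```rust
use std::collections::HashSet;

// Hypothetical helper illustrating the pruning rule: only
// confirmed-complete (`removed`) tasks may be dropped from a
// sub-graph. An in-flight task (in `sent` but not `removed`) must
// stay so the sub-graph still waits on it via dependency edges.
#[derive(Debug, PartialEq)]
enum SubGraphAction {
    Prune, // parent finished it; safe to drop
    Keep,  // still running in the parent, or unknown to it
}

fn classify(key: &str, sent: &HashSet<String>, removed: &HashSet<String>) -> SubGraphAction {
    if removed.contains(key) {
        SubGraphAction::Prune
    } else {
        let _ = sent; // deliberately not consulted for pruning
        SubGraphAction::Keep
    }
}

fn main() {
    let sent = HashSet::from(["setup".to_string(), "lint".to_string()]);
    let removed = HashSet::from(["setup".to_string()]); // lint still in flight
    assert_eq!(classify("setup", &sent, &removed), SubGraphAction::Prune);
    assert_eq!(classify("lint", &sent, &removed), SubGraphAction::Keep);
}
```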

2. src/task/mod.rs

Re-export TaskKey (keep deps module private):

pub use deps::{Deps, TaskKey};

3. src/task/task_executor.rs

Thread completed_tasks: HashSet<TaskKey> through three methods.

run_task_sched takes ownership (value moved from spawn site):

pub async fn run_task_sched(
    &self,
    task: &Task,
    config: &Arc<Config>,
    sched_tx: Arc<mpsc::UnboundedSender<(Task, Arc<Mutex<Deps>>)>>,
    completed_tasks: HashSet<TaskKey>,  // new
) -> Result<()> {

exec_task_run_entries and inject_and_wait borrow it:

async fn exec_task_run_entries(
    ...
    completed_tasks: &HashSet<TaskKey>,  // new
) -> Result<()> {
async fn inject_and_wait(
    ...
    completed_tasks: &HashSet<TaskKey>,  // new
) -> Result<()> {
    ...
    // was: Deps::new(config, to_run).await?
    let sub_deps = Deps::new_pruned(config, to_run, completed_tasks).await?;

4. src/cli/run.rs

spawn_sched_job -- snapshot before calling run_task_sched:

let completed = deps_for_remove.lock().await.handled_task_keys();
let result = this
    .run_task_sched(&task, &ctx.config, ctx.sched_tx.clone(), completed)
    .await;
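The snapshot pattern here — clone under the lock, release the guard before doing any long-running work — can be sketched with std types (the real code uses `tokio::sync::Mutex` and `Deps`; the `HashSet<String>` stand-in is hypothetical):

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

// Simplified sketch of the snapshot in `spawn_sched_job`: clone the
// completed-task set while the lock is held; the guard drops when the
// function returns, so nothing holds the lock across task execution.
fn snapshot_completed(deps: &Arc<Mutex<HashSet<String>>>) -> HashSet<String> {
    deps.lock().unwrap().clone()
}

fn main() {
    let deps = Arc::new(Mutex::new(HashSet::from(["setup".to_string()])));
    let completed = snapshot_completed(&deps);
    // The parent graph keeps making progress; the snapshot is fixed.
    deps.lock().unwrap().insert("lint".to_string());
    assert!(completed.contains("setup"));
    assert!(!completed.contains("lint"));
}
```

This point-in-time semantics is exactly why the later review comments discuss staleness: the snapshot does not observe completions that happen after it is taken.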

Run::run_task_sched wrapper -- forward the new parameter:

async fn run_task_sched(
    &self,
    task: &Task,
    config: &Arc<Config>,
    sched_tx: Arc<tokio::sync::mpsc::UnboundedSender<(Task, Arc<Mutex<Deps>>)>>,
    completed_tasks: std::collections::HashSet<crate::task::TaskKey>,  // new
) -> Result<()> {

E2e test

e2e/tasks/test_task_delegation_dedup -- serializes execution
(fix:security depends on lint) so setup is confirmed complete
before the sub-graph is created. Asserts setup runs exactly once
and standalone mise run -f check:security still triggers setup:

#!/usr/bin/env bash

cat <<'EOF' >mise.toml
[tasks.setup]
run = "echo '>>> SETUP RUNNING <<<'"

[tasks.lint]
hide = true
depends = ["setup"]
run = "echo 'lint done'"

[tasks."check:security"]
hide = true
depends = ["setup"]
run = "echo 'security check done'"

[tasks."fix:security"]
hide = true
depends = ["lint"]
run = [{ task = "check:security" }]

[tasks.all]
depends = ["fix:security"]
EOF

output="$(mise run -f all 2>&1)"
# Exclude command echo line ([setup] $ echo '...') from count
count="$(echo "$output" | grep -v '^\[setup\] \$' \
  | grep -c '>>> SETUP RUNNING <<<')"
if [[ $count -ne 1 ]]; then
  echo "Expected setup to run exactly once, but it ran $count times"
  echo "Full output:"
  echo "$output"
  exit 1
fi

assert_contains "echo \"$output\"" "lint done"
assert_contains "echo \"$output\"" "security check done"

assert_contains "mise run -f check:security" ">>> SETUP RUNNING <<<"
assert_contains "mise run -f check:security" "security check done"

Files changed

File Change
src/task/deps.rs Add new_pruned, handled_task_keys
src/task/mod.rs Re-export TaskKey alongside Deps
src/task/task_executor.rs Thread completed_tasks, use new_pruned
src/cli/run.rs Snapshot via handled_task_keys(), forward
e2e/tasks/test_task_delegation_dedup Regression test

@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where tasks delegated via run = [{ task }] could lead to shared dependencies being executed multiple times. The core change involves enhancing the dependency graph (Deps) to be aware of its parent's execution state. By passing the parent's dependency information to sub-graphs, tasks that have already been started or completed in the parent context are now correctly pruned, preventing redundant work and improving efficiency in task execution.

Highlights

  • Dependency Graph Management: Introduced a mechanism to pass parent dependency state to sub-graphs created during task delegation, ensuring shared dependencies are not re-executed.
  • New Deps Constructor: Added Deps::new_with_parent which prunes tasks from a sub-graph that have already been started or completed by its parent graph.
  • Runtime Deduplication: Implemented a 'defense-in-depth' check within emit_leaves to prevent tasks from being scheduled if they are already in-flight or completed in the parent graph.
  • API Updates: Modified several function signatures across src/cli/run.rs, src/task/deps.rs, and src/task/task_executor.rs to propagate the parent dependency state.
Changelog
  • src/cli/run.rs
    • Modified the run_task_sched function signature to accept a parent_deps argument.
    • Updated calls to run_task_sched to correctly pass the new parent_deps argument.
  • src/task/deps.rs
    • Removed the Clone trait from the Deps struct to prevent unintended cloning of dependency state.
    • Added an optional parent field (Option<Arc<Mutex<Deps>>>) to the Deps struct to link sub-graphs to their parents.
    • Introduced a new asynchronous constructor new_with_parent that creates a Deps instance and prunes tasks already handled by a provided parent Deps graph.
    • Enhanced the emit_leaves method with a 'defense-in-depth' check to prevent tasks already in-flight or completed in the parent graph from being scheduled by the sub-graph.
    • Imported tokio::sync::Mutex for thread-safe access to shared dependency state.
  • src/task/task_executor.rs
    • Modified the signatures of run_task_sched, run_scripts, and inject_and_wait to accept a parent_deps argument.
    • Updated internal calls within these functions to propagate the parent_deps argument.
    • Changed the inject_and_wait method to utilize Deps::new_with_parent instead of Deps::new when creating sub-dependency graphs for delegated tasks, ensuring proper deduplication of shared dependencies.
Activity
  • No specific reviewer activity has been recorded for this pull request yet.

@gemini-code-assist bot left a comment

Code Review

This pull request resolves an issue with duplicate execution of shared dependencies in delegated tasks by passing parent dependency state to sub-graphs, allowing them to prune tasks that are already running or completed. However, a logic error was identified in src/task/deps.rs regarding how in-flight tasks are handled during sub-graph construction, which could lead to dependency bypass. Despite this, the overall implementation is robust in addressing concurrency concerns, and other changes in src/cli/run.rs and src/task/task_executor.rs were found to be secure.

@vadimpiven vadimpiven force-pushed the fix/task-delegation-dedup branch from 96888cb to 67a1157 Compare March 7, 2026 14:29
@jdx (Owner) commented Mar 7, 2026

Confirmed this is a real bug. The root cause is clear: inject_and_wait in task_executor.rs:405 creates a completely independent Deps graph via Deps::new(config, to_run) with its own empty sent/removed HashSets, so it has no knowledge of what the parent graph has already scheduled or completed.

In the reproduction case:

  1. Parent graph runs setup as a dependency of lint
  2. fix:security delegates to check:security via run = [{ task }], which calls inject_and_wait and creates a new Deps graph
  3. check:security depends on setup → sub-graph doesn't know setup already ran → runs it again ✗

The fix approach is sound, but I think you should try a simpler implementation first: just pass a HashSet<TaskKey> of already-started/completed task keys to the sub-graph instead of the full Arc<Mutex<Deps>>. That would avoid the try_lock defense-in-depth logic in emit_leaves, the parent field on Deps, and the extra mutex coordination. A snapshot of the parent's state at sub-graph construction time should be sufficient for the common case.

This comment was generated by Claude Code.

@vadimpiven vadimpiven force-pushed the fix/task-delegation-dedup branch 2 times, most recently from a175b08 to 4873cad Compare March 7, 2026 14:46
@greptile-apps bot commented Mar 7, 2026

Greptile Summary

This PR fixes a deduplication bug where shared task dependencies were re-executed when a task used run = [{ task }] delegation. Each call to inject_and_wait previously created a fresh Deps graph with no knowledge of what the parent graph had already run, causing tasks like setup to fire once per delegation chain. The fix threads a point-in-time snapshot of the parent graph's confirmed-complete task keys (Deps::removed) into new_pruned, which strips those nodes before the sub-graph is subscribed.

Key changes:

  • Deps::new_pruned builds a sub-graph with already-completed nodes removed; handled_task_keys() exposes only the removed set (not in-flight sent) so the snapshot is always for tasks that have definitively finished.
  • completed_tasks is captured in spawn_sched_job right before run_task_sched begins — at this point all transitive dependencies of the current task are guaranteed to be in removed, making the snapshot consistent for the reported scenario.
  • Gap: the snapshot is not refreshed between sequential inject_and_wait calls within the same run array. If a task has multiple run = [{ task }] entries that share a dependency not yet in the parent's removed at start time, that dependency will run once per entry (the completions inside the first sub-graph are not visible to the second call).

The e2e test validates the specific scenario from the issue report and successfully verifies the fix.

Confidence Score: 3/5

  • Fix resolves the reported issue but leaves a gap for shared dependencies across multiple delegation entries within the same task.
  • The core fix correctly addresses the deduplication bug described in the issue: shared task dependencies no longer re-execute when a parent task delegates via run = [{ task }]. However, there is an identified limitation: when a single task has multiple delegation entries in its run array that share a common dependency, that dependency may still execute once per entry if not already completed in the parent graph at the time the task started. This is a less common scenario than the primary bug, and the fix is sound for the reported case. No data loss or security impact.
  • src/task/task_executor.rs — consider refreshing the snapshot between sequential inject_and_wait calls to fully deduplicate shared dependencies across multiple delegation entries in the same run array.

Important Files Changed

Filename Overview
src/task/deps.rs Adds new_pruned to build a sub-graph with already-completed tasks removed, and handled_task_keys() to snapshot the confirmed-complete set. The logic is sound — removing completed nodes from the graph before subscription ensures they don't re-execute. The implementation correctly handles transitive dependencies by only pruning tasks that have definitively finished (petgraph processes removals fully), so no orphaned dependents can occur.
src/task/task_executor.rs Threads completed_tasks: HashSet<TaskKey> through run_task_sched, exec_task_run_entries, and inject_and_wait, and uses Deps::new_pruned instead of Deps::new. The snapshot correctly captures all tasks completed before the current task started. However, the snapshot is not refreshed between sequential inject_and_wait calls within the same run array, so shared dependencies across multiple delegation entries can duplicate if not already in the parent's removed at snapshot time.
src/cli/run.rs Snapshots handled_task_keys() (parent graph's removed set) in spawn_sched_job at line 486 just before calling run_task_sched, and threads it through. The snapshot correctly captures all tasks completed before the current task started, since its transitive dependencies must be in removed by that point. The forwarding is correct.
src/task/mod.rs Re-exports TaskKey alongside Deps so callers outside the task module can reference the type without exposing the internal deps module — minimal, correct change.
e2e/tasks/test_task_delegation_dedup Regression test verifying setup runs exactly once during the delegated task chain scenario described in the issue. The test correctly validates that the fix resolves the primary bug. Execution order is serialized by depends = ["lint"] on fix:security, ensuring all dependencies are confirmed complete before delegation begins.

Comments Outside Diff (1)

  1. src/task/task_executor.rs, lines 337-375

    Stale snapshot across sequential run delegations

    completed_tasks is captured once at the start of run_task_sched and passed unchanged to every inject_and_wait call inside the run array loop. Tasks that complete inside the first sub-graph are recorded in that sub-graph's own Deps::removed — they never propagate back to completed_tasks. The second call therefore receives the same stale snapshot, and any shared dependency not already in the parent's removed at snapshot time will run once per delegation entry.

    Concrete reproduction:

    [tasks.shared-setup]
    run = "echo setup"
    
    [tasks.check-A]
    hide = true
    depends = ["shared-setup"]
    run = "echo check-A"
    
    [tasks.check-B]
    hide = true
    depends = ["shared-setup"]
    run = "echo check-B"
    
    [tasks.fix]
    run = [
      { task = "check-A" },
      { task = "check-B" },   # shared-setup will run again here
    ]

    If fix starts before shared-setup is in the parent's removed, shared-setup runs in the first sub-graph, but that fact is invisible to the second call. A potential fix would be to accumulate completions across sequential sub-graph executions (e.g., collect sub_deps.lock().await.handled_task_keys() after each inject_and_wait and merge into a local mutable set that is passed to the next call).
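The accumulation the reviewer suggests could look roughly like the following std-only sketch. All names here (`run_entries`, the closure standing in for sub-graph execution, string keys instead of `TaskKey`) are hypothetical simplifications of the real async code:

```rust
use std::collections::HashSet;

// Sketch of the suggested fix: merge each sub-graph's completions into
// a local set before the next delegation entry, so the second
// `{ task }` entry sees what the first one already ran.
fn run_entries(
    entries: &[&str],
    parent_completed: &HashSet<String>,
    // Simulated sub-graph execution: returns the keys it completed.
    mut run_sub_graph: impl FnMut(&str, &HashSet<String>) -> HashSet<String>,
) -> HashSet<String> {
    let mut completed = parent_completed.clone();
    for entry in entries {
        let finished = run_sub_graph(entry, &completed);
        completed.extend(finished); // visible to the next entry
    }
    completed
}

fn main() {
    // check-A and check-B both depend on shared-setup.
    let mut runs = Vec::new();
    let result = run_entries(&["check-A", "check-B"], &HashSet::new(), |entry, done| {
        let mut finished = HashSet::new();
        if !done.contains("shared-setup") {
            runs.push("shared-setup".to_string()); // dep actually executes
            finished.insert("shared-setup".to_string());
        }
        runs.push(entry.to_string());
        finished.insert(entry.to_string());
        finished
    });
    // shared-setup executed once, not once per delegation entry.
    assert_eq!(runs.iter().filter(|r| *r == "shared-setup").count(), 1);
    assert!(result.contains("check-A") && result.contains("check-B"));
}
```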

Last reviewed commit: 132fd58

When a task delegates via `run = [{ task }]`, the sub-graph created in
`inject_and_wait` had its own `sent` HashSet, causing shared dependencies
to run multiple times. Pass a snapshot of completed task keys to sub-graphs
so they can prune tasks already started or completed in the parent graph.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@vadimpiven vadimpiven force-pushed the fix/task-delegation-dedup branch from 0080f84 to 1914e1c Compare March 7, 2026 15:08
@jdx jdx merged commit f0c6753 into jdx:main Mar 7, 2026
34 checks passed
jdx pushed a commit that referenced this pull request Mar 7, 2026
### 🚀 Features

- **(vfox)** add `RUNTIME.envType` for libc variant detection by @malept
in [#8493](#8493)
- store provenance verification results in lockfile by @jdx in
[#8495](#8495)

### 🐛 Bug Fixes

- **(env)** skip remote version fetching for "latest" in prefer-offline
mode by @jdx in [#8500](#8500)
- **(tasks)** deduplicate shared deps across task delegation by
@vadimpiven in [#8497](#8497)
- **(windows)** correctly identify mise binary without extension by @jdx
in [#8503](#8503)

### 🚜 Refactor

- **(core)** migrate cmd! callers to async with kill_on_drop by @jdx in
[a63f7d2](a63f7d2)

### Chore

- **(ci)** temporarily disable `mise up` in release-plz by @jdx in
[#8504](#8504)
- consolidate all linters into hk.pkl by @jdx in
[#8498](#8498)

## 📦 Aqua Registry Updates

#### New Packages (1)

- [`apache/ant`](https://github.com/apache/ant)