feat: job table orderd topological when run is started (#4018)

johanneskoester merged 12 commits into snakemake:main
Conversation
…calculated first are displayed first in the jobs table. Jobs that are executed simultaneously are sorted alphabetically in the table.
…needs further work
📝 Walkthrough

Replaced alphabetical per-rule job counting with a topologically ordered aggregation for DAG stats.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant DAG as DAG (src/snakemake/dag.py)
    participant Topo as Toposort traversal
    participant Counter as Counter/Rows builder
    participant Tab as Tabulate/Formatter
    CLI->>DAG: request stats (needrun, finished)
    DAG->>Topo: compute ordered job list from relevant_jobs
    Topo-->>DAG: ordered unique jobs
    DAG->>Counter: count occurrences per ordered job, add total
    Counter-->>Tab: rows and stats_dict
    Tab-->>CLI: formatted stats table output
```
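Under hypothetical names, the diagrammed pipeline can be sketched in plain Python (the actual implementation lives in `src/snakemake/dag.py` and formats the table with `tabulate`; the fixed-width rendering below is just a stand-in):

```python
from collections import Counter


def format_stats(ordered_rule_names):
    """Sketch of the diagrammed flow: the caller passes rule names that are
    already in topological order; we count occurrences, append a total row,
    and render a minimal fixed-width table (stand-in for tabulate)."""
    counts = Counter(ordered_rule_names)           # one pass over the jobs
    rows = list(counts.items())                    # first-seen order is kept
    rows.append(("total", sum(counts.values())))
    width = max(len(name) for name, _ in rows)
    return "\n".join(f"{name:<{width}}  {count}" for name, count in rows)
```

Because `Counter` is a `dict` subclass, iterating it yields rules in the order they were first encountered, which preserves the topological ordering of the input.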
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake/dag.py`:
- Around line 3197-3208: The current loop builds orderd_jobs by iterating
self.toposorted() (all jobs) which diverges from the filtered set used to
compute total_count and also uses repeated orderd_jobs.count(...) calls making
it quadratic; change the logic in the block that constructs
orderd_jobs/deduplicated_job_list so it only iterates the same filtered job set
used for total_count (e.g., the needrun/finished filtered list) instead of
self.toposorted(), and replace repeated .count(...) calls with a single pass
counter (e.g., collections.Counter) to compute counts once, then populate rows
and stats_dict from that counter; update references to orderd_jobs,
deduplicated_job_list, rows, and stats_dict accordingly so ordering and counts
match the filtered jobs.
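The suggested rewrite can be sketched as follows (function and variable names are hypothetical, mirroring the review comment, not the actual code in `dag.py`): a single `collections.Counter` pass over the filtered, toposorted job list replaces the quadratic repeated `.count()` calls while keeping topological order.

```python
from collections import Counter


def stats_rows(filtered_ordered_jobs):
    """Sketch of the suggested fix: iterate only the filtered job list that
    total_count is computed from, counting in one O(n) pass instead of
    calling list.count() per rule (which makes the loop quadratic)."""
    counts = Counter(filtered_ordered_jobs)        # single pass
    # Counter preserves first-seen order, so rows stay topologically sorted.
    rows = [(rule, n) for rule, n in counts.items()]
    stats_dict = dict(counts)
    total_count = sum(counts.values())
    rows.append(("total", total_count))
    return rows, stats_dict
```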
In `@tests/tests.py`:
- Around line 2978-2981: The except block that currently does "assert False"
should instead raise an AssertionError with a clear message and chain the
original ValueError for context: catch the ValueError from names.index(expected,
idx) and raise AssertionError(f"Expected rule '{expected}' not found in stats
output") from the caught ValueError so the original traceback is preserved (look
for the try/except around names.index(expected, idx) using variables idx, names,
expected).
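A minimal sketch of the exception-chaining pattern the comment asks for (the helper name is invented for illustration; in the test the `try/except` wraps `names.index(expected, idx)` inline):

```python
def find_expected_rule(names, expected, idx):
    """Raise a descriptive AssertionError instead of a bare `assert False`,
    chaining the original ValueError so its traceback is preserved."""
    try:
        return names.index(expected, idx)
    except ValueError as e:
        raise AssertionError(
            f"Expected rule '{expected}' not found in stats output"
        ) from e
```

With `raise ... from e`, the `ValueError` appears as `__cause__` of the `AssertionError`, so pytest shows both tracebacks.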
📒 Files selected for processing (3)

- src/snakemake/dag.py
- tests/test_issue3922/Snakefile
- tests/tests.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/tests.py`:
- Line 2952: Replace the bare open call with a context manager to ensure the
file handle is closed: change the line that assigns txt from open(stats_path,
"r", encoding="utf-8").read() to using with open(stats_path, "r",
encoding="utf-8") as f: txt = f.read() so the file is properly closed after
reading.
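The context-manager form the comment suggests, as a small standalone sketch (the helper name is hypothetical; in the test the `with` block is inline):

```python
def read_stats(stats_path):
    """Read the stats file via a with-statement so the handle is closed
    even if read() raises, unlike a bare open(...).read()."""
    with open(stats_path, "r", encoding="utf-8") as f:
        return f.read()
```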
📒 Files selected for processing (2)

- src/snakemake/dag.py
- tests/tests.py
🧹 Nitpick comments (1)
tests/tests.py (1)
2957-3007: Well-implemented test for topological ordering and counts.

The past review issues have been addressed:

- File handle now properly uses a context manager (lines 2968-2969)
- Exception chaining correctly preserves the original ValueError context (lines 2996-2999)

The test logic correctly verifies that the expected rules appear as a subsequence in topological order.

Optional improvement: Consider cleaning up the temp directory on successful test completion to avoid accumulating leftover directories in CI environments. Other tests in this file follow a pattern of manual cleanup when using `cleanup=False`:

♻️ Optional cleanup suggestion

```diff
 def test_stats_table_order_and_counts():
     # Run snakemake in the example dir and ask it to write stats.txt
     outdir = run(
         dpath("test_issue3922"),
         shellcmd="snakemake --dryrun > stats.txt 2>&1",
         check_results=False,
         cleanup=False,
     )
     stats_path = os.path.join(outdir, "stats.txt")
     assert os.path.exists(stats_path), "stats.txt was not created"
-    with open(stats_path, "r", encoding="utf-8") as f:
-        txt = f.read()
+    try:
+        with open(stats_path, "r", encoding="utf-8") as f:
+            txt = f.read()
         # ... rest of test logic ...
+    finally:
+        shutil.rmtree(outdir, ignore_errors=ON_WINDOWS)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/tests.py` around lines 2957 - 3007, The test test_stats_table_order_and_counts leaves the temp output directory when run(..., cleanup=False), which can accumulate artifacts in CI; update the test to remove the created outdir on success (mirroring other tests) by calling the same cleanup routine used elsewhere after the assertions (e.g., remove the directory returned by run/dpath or call the project-specific cleanup helper), ensuring cleanup runs only after all assertions pass and does not mask failures.
📒 Files selected for processing (2)

- src/snakemake/dag.py
- tests/tests.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/snakemake/dag.py
🤖 I have created a release *beep* *boop*

---

## [9.17.0](v9.16.3...v9.17.0) (2026-03-13)

### Features

* Allow storing snakemake metadata in files or databases ([#4012](#4012)) ([dd75f31](dd75f31))
* Allow to specify comparison command per-unit test ([#3956](#3956)) ([b88171c](b88171c))
* job table orderd topological when run is started ([#4018](#4018)) ([75cf506](75cf506))
* lambda functions for priority in rules ([#3253](#3253)) ([d2aa226](d2aa226))
* Make on... directive of modules accessible ([#4050](#4050)) ([e9f2e1c](e9f2e1c))

### Bug Fixes

* adjust conda tests to not fail on apple silicon; fix [#4040](#4040) ([#4049](#4049)) ([f5b0142](f5b0142))
* allow "--containerize apptainer" to output apptainer format instead of dockerfile ([#4030](#4030)) ([f5cac30](f5cac30))
* apptainer command not recognized when singularity is absent ([#4010](#4010)) ([b8162e2](b8162e2))
* capture stderr when tests fail ([#3995](#3995)) ([97d74ba](97d74ba))
* **docs:** make Data-dependent conditional execution a complete example ([#4043](#4043)) ([3a1d7f2](3a1d7f2))
* don't build the DAG when running unlock. Fixes [#4000](#4000) and [#198](#198) ([#4007](#4007)) ([acf79fd](acf79fd))
* Ensure pixi tasks may be run as advertised ([#4046](#4046)) ([88253c2](88253c2))
* fix checkpoint handling corner cases ([#3870](#3870) and [#3559](#3559)) ([#4015](#4015)) ([63f4257](63f4257))
* issue 3642 ([#4054](#4054)) ([76e6fc2](76e6fc2))
* issue 3815 ([#4026](#4026)) ([b0eec96](b0eec96))
* logging None in shellcmd context causes error ([#4064](#4064)) ([d0652cd](d0652cd))
* lookup function returns default value for empty DataFrame queries ([#4056](#4056)) ([f71de97](f71de97))
* make `cache: omit-software` a rule specific property ([#4085](#4085)) ([034a9e7](034a9e7))
* reduce number of tests leaving temporary files behind ([#4033](#4033)) ([a3a1c97](a3a1c97))
* regression in dynamic resource handling ([#4038](#4038)) ([f2c554a](f2c554a))
* somewhat shorter announce message ([#4080](#4080)) ([57efc71](57efc71))

### Performance Improvements

* switch reretry with tenacity; decouple container classes (with Python 3.7 compat for old scripts) from rest of the codebase (enabling moving to newer python versions) ([#4032](#4032)) ([ffb19e7](ffb19e7))

### Documentation

* Add AI-assisted contributions policy to contributing guidelines ([#4051](#4051)) ([dd70526](dd70526))
* **codebase:** Update & simplify plugin architecture section ([#4052](#4052)) ([176cf63](176cf63))
* Correct workflow.source_path() description in documentation ([#4036](#4036)) ([45883c5](45883c5))
* fixed wrong code example for collect() function ([#4037](#4037)) ([5c85ed8](5c85ed8))
* Minor docs improvements ([#4089](#4089)) ([29ea226](29ea226))
* switch to sphinx_design for tabs ([#3976](#3976)) ([9674614](9674614))
* typo in the migration table breaking a pip install command ([#4024](#4024)) ([66f9dda](66f9dda))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
This PR changes the sorting of the job table that is printed at the beginning of a run. The jobs are no longer sorted alphabetically but in topological order of the DAG (i.e., the order in which they are scheduled for execution). If several jobs sit on the same level of the DAG, they are sorted alphabetically within that level.
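The described ordering can be sketched with a hypothetical helper (Snakemake's real level-wise traversal is `DAG.toposorted()` in `src/snakemake/dag.py`): levels are emitted in dependency order, and names within a level are sorted alphabetically.

```python
def ordered_rule_names(toposorted_levels):
    """toposorted_levels: list of sets of rule names, one per DAG level,
    with outermost dependencies first. Across levels, topological order is
    kept; within a level, names are sorted alphabetically."""
    ordered = []
    for level in toposorted_levels:
        ordered.extend(sorted(level))
    return ordered
```

For example, jobs `download` and `trim` on the first level come out alphabetically before the downstream `align` and `all` jobs.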
This could close:
Fixes #1767
Fixes #3922
QC

- [ ] The documentation (docs/) is updated to reflect the changes, or this is not necessary (e.g. if the change modifies neither the language nor the behavior or functionalities of Snakemake). I haven't found anywhere in the docs that explicitly mentions this table; if necessary, I will add it.
Summary by CodeRabbit
Bug Fixes
Tests