Skip to content

test: add test that #3913 is fixed by #4026#4022

Merged
johanneskoester merged 3 commits intosnakemake:mainfrom
edbennett:3913-overallocation-free-without-consume
Mar 11, 2026
Merged

test: add test that #3913 is fixed by #4026#4022
johanneskoester merged 3 commits intosnakemake:mainfrom
edbennett:3913-overallocation-free-without-consume

Conversation

@edbennett
Copy link
Copy Markdown
Contributor

@edbennett edbennett commented Mar 10, 2026

Before this change, in some cases jobs were started without calling update_available_resources to subtract the resources used from the resource pool. Regardless, when the job completed, the resources were re-added to the resource pool.
This change ensures that the resource pool is always updated, and slightly rearranges the logic to avoid duplicating the call to job_scheduler_greedy.

The PR also includes a test that failed before the change, and now passes; it tests the behaviour described in the original issue (resources over allocated), rather than the specific bug fixed above.

fixes #3913

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • Tests
    • Added test suite validating thread allocation and resource constraints during parallel workflow execution
    • Test verifies thread usage remains within configured limits across concurrent tasks
    • Validates job scheduling behavior and completion tracking with multi-threaded rule execution

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 10, 2026

📝 Walkthrough

Walkthrough

This PR adds test coverage for GitHub issue #3913, where the greedy scheduler schedules jobs in parallel such that combined thread requirements exceed the provided core limit. The test reproduces the issue scenario and verifies the scheduler respects the specified core limit.

Changes

Cohort / File(s) Summary
Greedy Scheduler Test Case
tests/test_github_issue3913/Snakefile
Adds three Snakemake rules (final, prepare, do_something) that create a scenario where multiple jobs with 8 threads each could be scheduled in parallel, exceeding the 8-core limit.
Test Validation Logic
tests/tests.py
Adds test_github_issue3913 function that executes the workflow with fixed core limit and greedy scheduler, parses execution logs to track active job thread counts, and verifies total active threads never exceed the core limit or per-job thread requirements.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description check ✅ Passed The description explains the bug fix, includes test details, and the author completed both QC checkboxes as required by the template.
Linked Issues check ✅ Passed The changes add a test that reproduces the overallocation issue from #3913, verifying that the greedy scheduler no longer exceeds core limits.
Out of Scope Changes check ✅ Passed Both file changes (Snakefile and test) are directly related to validating the fix for issue #3913, with no unrelated modifications.
Title check ✅ Passed The title references a specific issue (#3913) and PR (#4026), accurately describing that this PR adds a test demonstrating the fix for that issue.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/tests.py`:
- Around line 2939-2945: The test test_github_issue3913 fails on Windows because
its driven Snakefile uses Unix-only shell commands (touch/sleep); modify the
test to skip on Windows by adding the `@skip_on_windows` decorator (or equivalent
skip marker) to test_github_issue3913, or make the fixture portable by replacing
shell commands in the referenced Snakefile/run(...) invocation with
cross-platform Python-based operations (e.g., using python -c to create files
and sleep) so the test passes on Windows CI.
- Around line 2947-2974: The test currently can pass trivially because
active_job_thread_counts stays empty; modify the loop that reads the log (which
builds active_job_thread_counts using variables current_jobid and finished_jobid
and compares against core_count_limit) to track whether any job lifecycle lines
were actually observed (e.g., set a boolean saw_job_lifecycle when encountering
"jobid:", "threads", "resources:" or "Finished jobid:"), assert at the end of
the log processing that saw_job_lifecycle is true, and also assert that
active_job_thread_counts is empty after the loop finishes (ensuring all tracked
jobs were finished); keep the existing per-line thread-limit checks but only
rely on the trivial-pass path when appropriate and fail the test if no job lines
were parsed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd610d05-385a-4b7f-9c8a-60a473120aed

📥 Commits

Reviewing files that changed from the base of the PR and between b88171c and 5bcbc3b.

📒 Files selected for processing (6)
  • src/snakemake/scheduling/job_scheduler.py
  • tests/test_github_issue3913/Snakefile
  • tests/test_github_issue3913/expected-results/common
  • tests/test_github_issue3913/expected-results/file1
  • tests/test_github_issue3913/expected-results/file2
  • tests/tests.py

@johanneskoester
Copy link
Copy Markdown
Contributor

@edbennet, I just merged a PR by @fgvieira (#4026) that fixes the same issue. We decided that the advantage of not using the complex scheduler (LP) for just a single job is so marginal that it is not worth the additional code. However, your test case is still worth adding! Can you maybe resolve the conflict and then ping me via discord?

@edbennett edbennett force-pushed the 3913-overallocation-free-without-consume branch from c75b87b to 1f7d665 Compare March 11, 2026 08:16
@edbennett edbennett changed the title fix: avoid freeing resources that were never allocated; fix #3913 add test that #3913 is fixed by #4026 Mar 11, 2026
@edbennett edbennett changed the title add test that #3913 is fixed by #4026 test: add test that #3913 is fixed by #4026 Mar 11, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/tests.py (1)

2998-3001: Don't tie this regression to the fixture's exact job count.

The bug here is "never exceed the effective core budget", but Lines 2998-3000 also hard-code the current workflow shape. That makes the test fail on harmless fixture changes while the scheduler behavior is still correct. Prefer asserting that the parser observed lifecycle events, that starts and finishes match, and that no tracked jobs remain active at EOF.

Suggested change
-    # Ensure that we observed every job we expect;
-    # avoid test passing spuriously if log format changes
-    assert completed_job_count == 4 and started_job_count == 4
+    # Ensure the parser actually observed job lifecycle lines without
+    # coupling the regression to the fixture's current job graph.
+    assert started_job_count > 0
+    assert completed_job_count > 0
+    assert started_job_count == completed_job_count
+    assert not active_job_thread_counts
     shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tests.py` around lines 2998 - 3001, Replace the fragile exact-count
assertion that checks completed_job_count == 4 and started_job_count == 4 with a
check that ensures lifecycle consistency: assert started_job_count ==
completed_job_count and started_job_count > 0 (or >= 1) to confirm we observed
starts and finishes, and also assert there are no active/tracked jobs at EOF
(e.g., verify any active_jobs or outstanding_job set is empty). Keep the cleanup
call shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS) intact; update references
to completed_job_count and started_job_count accordingly (these variables are in
the current test scope).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/tests.py`:
- Around line 2998-3001: Replace the fragile exact-count assertion that checks
completed_job_count == 4 and started_job_count == 4 with a check that ensures
lifecycle consistency: assert started_job_count == completed_job_count and
started_job_count > 0 (or >= 1) to confirm we observed starts and finishes, and
also assert there are no active/tracked jobs at EOF (e.g., verify any
active_jobs or outstanding_job set is empty). Keep the cleanup call
shutil.rmtree(tmpdir, ignore_errors=ON_WINDOWS) intact; update references to
completed_job_count and started_job_count accordingly (these variables are in
the current test scope).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b0dfb225-f34d-45a2-bc15-32565d567176

📥 Commits

Reviewing files that changed from the base of the PR and between c75b87b and 1f7d665.

📒 Files selected for processing (5)
  • tests/test_github_issue3913/Snakefile
  • tests/test_github_issue3913/expected-results/common
  • tests/test_github_issue3913/expected-results/file1
  • tests/test_github_issue3913/expected-results/file2
  • tests/tests.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_github_issue3913/Snakefile

@johanneskoester johanneskoester merged commit e6ca6f8 into snakemake:main Mar 11, 2026
62 checks passed
@johanneskoester
Copy link
Copy Markdown
Contributor

Thank you so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Greedy scheduler uses more cores than provided

2 participants