fix: avoid checking output files in immediate-submit mode#3569
fix: avoid checking output files in immediate-submit mode#3569johanneskoester merged 15 commits intosnakemake:mainfrom
Conversation
|
Warning Rate limit exceeded@johanneskoester has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 20 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughImplements immediate-submit semantics across scheduling and postprocessing, adds a CLI help clarification, and introduces a non-shared-filesystem integration test using a custom cluster submission/execution script. The scheduler now skips local rules under immediate-submit and propagates ignore-missing-output behavior into job postprocessing. Tests verify end-to-end behavior. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant CLI as CLI
participant Scheduler as JobScheduler
participant Exec as Executor
participant Job as Job
participant FS as Storage/FS
User->>CLI: snakemake --immediate-submit [--notemp ...]
CLI->>Scheduler: parse workflow, flags
Scheduler->>Scheduler: plan DAG
alt immediate-submit
note over Scheduler: Skip local rules (warn)
end
loop For each runnable job
Scheduler->>Exec: submit job (may be remote/subprocess)
end
Exec->>Job: run
Job->>FS: produce outputs (async)
Job-->>Scheduler: done (status)
Scheduler->>Job: postprocess(ignore_missing_output = immediate_submit or touch)
alt ignore_missing_output = true
note over Job: Do not wait for outputs
else
Job->>FS: wait for outputs/benchmark
end
sequenceDiagram
autonumber
participant Snk as Snakemake (cluster-generic)
participant Sub as clustersubmit (submit)
participant DB as SQLite DB
participant Exe as clustersubmit (--execute)
participant Sh as Shell
Snk->>Sub: submit jobscript [--dependencies ...]
Sub->>DB: insert job + edges
loop execute cycle
Exe->>DB: fetch runnable jobs
DB-->>Exe: job(s) with no unfinished parents
par async job execution
Exe->>Sh: run jobscript
Sh-->>Exe: exit code
Exe->>DB: mark job done
and possibly others
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
|
I modified the scheduler so that local rules break the DAG without raising an error for missing output file, but logging a warning. |
|
@cmeesters with my apologies for the long delay, I have finally implemented a test ( |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (7)
tests/test_immediate_submit_without_shared_fs/clustersubmit (2)
68-77: Hard-coded 100 iteration cap can leave jobs unprocessedThe loop stops after 100 passes even if new runnable jobs keep appearing (large DAGs).
Use awhile True:loop with an exit condition instead.- for _ in range(100): + while True: @@ - else: - logging.info("Could not find any other job to execute.") - return + if not result: + logging.info("No runnable jobs left – exiting.") + return
104-106: Dependency parsing discards empty strings – cleaner expression possibleCurrent list-comp is harder to read; using
split()without an argument already removes duplicates of whitespace.-dependencies = [s for s in args.dependencies.split(" ") if s not in [' ', '', None]] +dependencies = args.dependencies.split()tests/tests.py (2)
2476-2477: Drop the stray blank linesThese two empty lines introduce an unnecessary diff hunk and break the otherwise tight grouping of tests.
Feel free to remove them – it keeps the history cleaner.
2478-2489: Inline-concatenated shell command is hard to read & easy to mis-escapeThe multi-line literal relies on implicit string concatenation and heavy escaping for the embedded quotes. This is easy to break the next time someone touches it.
Consider building the command as a single triple-quoted string (or
"\n".join([...])) so that quoting is obvious and you only escape once. Example:- shellcmd="snakemake " - "--executor cluster-generic " - "--cluster-generic-submit-cmd \"./clustersubmit --dependencies \\\"{dependencies}\\\"\" " - "--forceall " - "--immediate-submit " - "--notemp " - "--jobs 10 " - "; ./clustersubmit --execute" + shellcmd=( + "snakemake " + "--executor cluster-generic " + "--cluster-generic-submit-cmd \"./clustersubmit --dependencies \\\"{dependencies}\\\"\" " + "--forceall " + "--immediate-submit " + "--notemp " + "--jobs 10 " + "; ./clustersubmit --execute" + )This keeps the final string identical while being far less brittle.
tests/test_immediate_submit_without_shared_fs/Snakefile (3)
10-11: Preferprintfoverechofor portable, predictable output
echointerpretation (e.g.-n, backslash escapes) varies between shells.
printfis portable and lets you control the trailing newline explicitly:- shell: "echo {wildcards.fname} > {output.txt}" + shell: "printf '%s\n' {wildcards.fname} > {output.txt}"
18-19: Minor: avoid UUOC (“useless use of cat”)You can pass the file directly to
wcand drop the extra process:- shell: "cat {input.txt} | wc -m > {output.dat}" + shell: "wc -m < {input.txt} > {output.dat}"This slightly speeds up the rule and is a common shell best practice.
29-33: Quote loop variable to survive spaces in file namesIf a file path ever contains spaces the current loop breaks.
Quoting fixes that and removes a useless subshell:- for i in {input.counts}; do - SUM=$(($SUM + $(cat $i))); + for i in {input.counts}; do + SUM=$((SUM + $(cat "$i"))) done;Not critical for the current test files, but makes the rule future-proof.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/snakemake/cli.py(1 hunks)src/snakemake/jobs.py(1 hunks)src/snakemake/scheduler.py(4 hunks)tests/test_immediate_submit_without_shared_fs/Snakefile(1 hunks)tests/test_immediate_submit_without_shared_fs/clustersubmit(1 hunks)tests/test_immediate_submit_without_shared_fs/expected-results/report.yaml(1 hunks)tests/tests.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tests/test_immediate_submit_without_shared_fs/expected-results/report.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- src/snakemake/scheduler.py
- src/snakemake/jobs.py
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest type annotations for functions that are defined inside of functions or methods. Do not suggest type annotation of the `s...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
src/snakemake/cli.pytests/tests.py
🧬 Code Graph Analysis (1)
tests/tests.py (1)
tests/common.py (2)
run(151-495)dpath(32-35)
🔇 Additional comments (1)
src/snakemake/cli.py (1)
1589-1593: Help text may be outdated w.r.t. new localrule handlingThe message still says “localrules will fail”, but the scheduler change in this PR now skips localrules with a warning rather than failing.
Please align the help text with the current behaviour to avoid user confusion.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (3)
tests/test_immediate_submit_without_shared_fs/clustersubmit (3)
101-104: Risk of duplicate job names with current generation strategy.The random job name generation could create duplicates, leading to SQLite constraint violations if a PRIMARY KEY is added (as suggested above).
Consider using a more robust job ID generation:
-job_name = ''.join( - [properties['rule'], '-'] + - [random.choice(string.ascii_lowercase) for _ in range(5)] -) +import time +job_name = f"{properties['rule']}-{int(time.time())}-{''.join(random.choices(string.ascii_lowercase, k=5))}"
70-79: Arbitrary iteration limit may cause premature termination.The hardcoded limit of 100 iterations could prevent all jobs from completing in complex DAGs. Consider using a more flexible approach.
- for _ in range(100): + max_iterations = 1000 # or make configurable + iteration = 0 + consecutive_empty_rounds = 0 + + while iteration < max_iterations and consecutive_empty_rounds < 3: + iteration += 1 with sqlite3.connect(DATABASE_FILE) as db: result = db.execute(QUERY_RUNNABLE).fetchall() if len(result): + consecutive_empty_rounds = 0 async with asyncio.TaskGroup() as tg: for row in result: tg.create_task(execute_single_job(*row)) else: + consecutive_empty_rounds += 1 - logging.info("Could not find any other job to execute.") - return + if consecutive_empty_rounds >= 3: + logging.info("No runnable jobs found after 3 attempts. Stopping.") + return + await asyncio.sleep(1) # Brief pause before retrying
12-14: Hardcoded paths may cause issues in different environments.Using hardcoded
/tmp/paths may not work reliably across all systems or when running multiple tests concurrently.Consider using environment variables or temporary directories:
+import tempfile + ## Configuration -DATABASE_FILE = "/tmp/clustersubmit.sqlite3" -LOG_FILE = "/tmp/clustersubmit.log" +DATABASE_FILE = os.environ.get("CLUSTER_DB_FILE", os.path.join(tempfile.gettempdir(), "clustersubmit.sqlite3")) +LOG_FILE = os.environ.get("CLUSTER_LOG_FILE", os.path.join(tempfile.gettempdir(), "clustersubmit.log"))
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
tests/test_immediate_submit_without_shared_fs/clustersubmit (4)
84-84: Fix typo in error message.The error message contains a typo: "Falied" should be "Failed".
- raise RuntimeError(f"Falied executing job {jobid}") + raise RuntimeError(f"Failed executing job {jobid}")
82-84: Consider sanitizing logged output to prevent information disclosure.Logging stdout/stderr at CRITICAL level could expose sensitive information or create excessively large log files. Consider truncating or sanitizing the output.
- logging.critical(stdout) - logging.critical(stderr) + logging.error(f"Job {jobid} failed with return code {proc.returncode}") + if stdout: + logging.debug(f"Job {jobid} stdout: {stdout.decode()[:500]}...") # Truncate long output + if stderr: + logging.error(f"Job {jobid} stderr: {stderr.decode()[:500]}...")
133-136: Improve job name uniqueness for concurrent submissions.Using
time.time()alone could create naming conflicts if multiple jobs are submitted simultaneously. Consider usinguuidor more granular timing.+import uuid + -job_name = ''.join( - [properties['rule'], '-', str(time.time()), '-'] + - [random.choice(string.ascii_lowercase) for _ in range(5)] -) +job_name = f"{properties['rule']}-{uuid.uuid4().hex[:8]}"
66-67: Optimize database cleanup frequency.Database cleanup runs on every script invocation, which could be inefficient for frequent submissions. Consider cleanup only when executing jobs or periodically.
- db.execute("DELETE FROM jobs WHERE created_at <= datetime('now', '-1 hour')") - db.execute("DELETE FROM edges WHERE created_at <= datetime('now', '-1 hour')") + # Only cleanup during execution mode to reduce overhead + if args.execute: + db.execute("DELETE FROM jobs WHERE created_at <= datetime('now', '-1 hour')") + db.execute("DELETE FROM edges WHERE created_at <= datetime('now', '-1 hour')")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/snakemake/cli.py(1 hunks)tests/test_immediate_submit_without_shared_fs/Snakefile(1 hunks)tests/test_immediate_submit_without_shared_fs/clustersubmit(1 hunks)tests/tests.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- src/snakemake/cli.py
- tests/tests.py
- tests/test_immediate_submit_without_shared_fs/Snakefile
🔇 Additional comments (3)
tests/test_immediate_submit_without_shared_fs/clustersubmit (3)
46-64: Database schema looks well-designed with proper constraints.The SQLite schema now includes proper primary keys, foreign key constraints, and appropriate column types. The WAL mode configuration addresses concurrency concerns effectively.
74-91: Excellent concurrency handling and error management.The async job execution function properly handles subprocess errors, uses database locking to prevent race conditions, and includes appropriate logging. The WAL mode and timeout settings address the concurrency issues from previous reviews.
122-128: Efficient single-pass file reading implementation.The job script is now read once and reused for both property extraction and database storage, eliminating the redundant file operations from previous versions.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
… DB lock Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/test_immediate_submit_without_shared_fs/clustersubmit (2)
88-88: Fix typo in error message."Faied" should be "Failed".
- raise RuntimeError(f"Faied executing job {jobid}") + raise RuntimeError(f"Failed executing job {jobid}")
162-162: Remove stray line number.The standalone "162" appears to be a formatting artifact and should be removed.
-162
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/snakemake/cli.py(1 hunks)src/snakemake/jobs.py(1 hunks)src/snakemake/scheduler.py(4 hunks)tests/test_immediate_submit_without_shared_fs/Snakefile(1 hunks)tests/test_immediate_submit_without_shared_fs/clustersubmit(1 hunks)tests/test_immediate_submit_without_shared_fs/expected-results/report.yaml(1 hunks)tests/tests.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- tests/test_immediate_submit_without_shared_fs/expected-results/report.yaml
- src/snakemake/jobs.py
- src/snakemake/cli.py
- tests/tests.py
- tests/test_immediate_submit_without_shared_fs/Snakefile
- src/snakemake/scheduler.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: johanneskoester
PR: snakemake/snakemake#3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.
Learnt from: johanneskoester
PR: snakemake/snakemake#2960
File: CHANGELOG.md:6-6
Timestamp: 2024-10-08T17:41:54.542Z
Learning: Do not review pull requests generated by Release Please for the Snakemake project.
Learnt from: johanneskoester
PR: snakemake/snakemake#2960
File: CHANGELOG.md:6-6
Timestamp: 2024-08-13T11:05:23.821Z
Learning: Do not review pull requests generated by Release Please for the Snakemake project.
Learnt from: johanneskoester
PR: snakemake/snakemake#3140
File: snakemake/dag.py:1308-1308
Timestamp: 2024-10-14T09:42:11.571Z
Learning: In `snakemake/dag.py`, avoid flagging missing lines or indentation issues when there is no clear syntax or logical error to prevent false positives.
Learnt from: leoschwarz
PR: snakemake/snakemake#3176
File: tests/test_output_index.py:99-157
Timestamp: 2025-01-17T12:00:09.368Z
Learning: New test dependencies for Snakemake should be introduced in separate PRs rather than being added as part of feature or refactoring PRs.
Learnt from: johanneskoester
PR: snakemake/snakemake#3014
File: snakemake/workflow.py:1250-1251
Timestamp: 2024-08-13T16:22:58.430Z
Learning: Avoid commenting on trailing commas in code reviews for the Snakemake repository as per the user's preference.
Learnt from: johanneskoester
PR: snakemake/snakemake#3014
File: snakemake/workflow.py:1250-1251
Timestamp: 2024-10-08T17:41:54.542Z
Learning: Avoid commenting on trailing commas in code reviews for the Snakemake repository as per the user's preference.
Learnt from: johanneskoester
PR: snakemake/snakemake#3204
File: tests/test_cores_cluster/qsub:1-6
Timestamp: 2024-11-12T20:22:54.184Z
Learning: In the Snakemake codebase, the `tests/test_cores_cluster/qsub` script is a dummy script for testing, and input validation and error handling are not required in such scripts.
Learnt from: lczech
PR: snakemake/snakemake#3113
File: snakemake/scheduler.py:912-914
Timestamp: 2024-10-04T16:12:18.927Z
Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
Learnt from: kdm9
PR: snakemake/snakemake#3562
File: src/snakemake/checkpoints.py:90-90
Timestamp: 2025-05-06T01:37:23.382Z
Learning: In Snakemake checkpoints implementation, tracking only the first missing output for each checkpoint is sufficient, because if one output is missing, all outputs for that checkpoint are considered incomplete. This was the behavior before PR #3562 and maintained in the pluralized `checkpoints.get()` implementation.
Learnt from: johanneskoester
PR: snakemake/snakemake#3600
File: src/snakemake/jobs.py:960-964
Timestamp: 2025-05-23T09:40:24.474Z
Learning: In the `cleanup` method of the `Job` class in `src/snakemake/jobs.py`, files in the `to_remove` list should be formatted with `fmt_iofile` without specifying `as_output=True` or `as_input=True` parameters, as these files should be displayed as generic files rather than specifically as output files.
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
Learnt from: johanneskoester
PR: snakemake/snakemake#3096
File: snakemake/dag.py:1943-1943
Timestamp: 2024-09-19T18:47:50.192Z
Learning: In the temp file deletion logic, temp files are only deleted after confirming that no other job (including non-checkpoint jobs) still needs them.
tests/test_immediate_submit_without_shared_fs/clustersubmit (14)
Learnt from: johanneskoester
PR: snakemake/snakemake#3204
File: tests/test_cores_cluster/qsub:1-6
Timestamp: 2024-11-12T20:22:54.184Z
Learning: In the Snakemake codebase, the `tests/test_cores_cluster/qsub` script is a dummy script for testing, and input validation and error handling are not required in such scripts.
Learnt from: lczech
PR: snakemake/snakemake#3113
File: snakemake/scheduler.py:912-914
Timestamp: 2024-10-04T16:12:18.927Z
Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
Learnt from: johanneskoester
PR: snakemake/snakemake#3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Learnt from: leoschwarz
PR: snakemake/snakemake#3176
File: tests/test_output_index.py:99-157
Timestamp: 2025-01-17T12:00:09.368Z
Learning: New test dependencies for Snakemake should be introduced in separate PRs rather than being added as part of feature or refactoring PRs.
Learnt from: johanneskoester
PR: snakemake/snakemake#2963
File: snakemake/scheduler.py:199-199
Timestamp: 2024-10-08T17:41:54.542Z
Learning: In the Snakemake project, do not suggest adding trailing commas in multi-line logging statements, as Black autoformatter handles this.
Learnt from: johanneskoester
PR: snakemake/snakemake#2963
File: snakemake/scheduler.py:199-199
Timestamp: 2024-08-27T14:31:21.449Z
Learning: In the Snakemake project, do not suggest adding trailing commas in multi-line logging statements, as Black autoformatter handles this.
Learnt from: landerlini
PR: snakemake/snakemake#3569
File: tests/test_immediate_submit_without_shared_fs/clustersubmit:0-0
Timestamp: 2025-06-19T13:58:18.447Z
Learning: When reviewing SQL schemas, user landerlini prefers "created_at" over "timestamp" for column naming and prefers to avoid DEFAULT CURRENT_TIMESTAMP in favor of manual timestamp insertion.
Learnt from: johanneskoester
PR: snakemake/snakemake#3014
File: snakemake/workflow.py:1250-1251
Timestamp: 2024-08-13T16:22:58.430Z
Learning: Avoid commenting on trailing commas in code reviews for the Snakemake repository as per the user's preference.
Learnt from: johanneskoester
PR: snakemake/snakemake#3014
File: snakemake/workflow.py:1250-1251
Timestamp: 2024-10-08T17:41:54.542Z
Learning: Avoid commenting on trailing commas in code reviews for the Snakemake repository as per the user's preference.
Learnt from: johanneskoester
PR: snakemake/snakemake#3148
File: snakemake/dag.py:1332-1336
Timestamp: 2024-11-12T12:08:20.342Z
Learning: In `snakemake/dag.py`, when code is outdated and will disappear upon resolving merge conflicts, avoid making code review suggestions on that code.
Learnt from: mbhall88
PR: snakemake/snakemake#3188
File: tests/test_script/scripts/test.sh:4-4
Timestamp: 2024-11-07T00:32:44.137Z
Learning: In test scripts within the Snakemake project, concise code is preferred over verbose error handling when simplicity suffices.
Learnt from: johanneskoester
PR: snakemake/snakemake#2960
File: CHANGELOG.md:6-6
Timestamp: 2024-10-08T17:41:54.542Z
Learning: Do not review pull requests generated by Release Please for the Snakemake project.
Learnt from: johanneskoester
PR: snakemake/snakemake#2960
File: CHANGELOG.md:6-6
Timestamp: 2024-08-13T11:05:23.821Z
Learning: Do not review pull requests generated by Release Please for the Snakemake project.
Learnt from: johanneskoester
PR: snakemake/snakemake#3171
File: snakemake/cli.py:106-145
Timestamp: 2024-10-29T09:26:26.636Z
Learning: Avoid adding input validation or error handling that unnecessarily complicates the code in the `snakemake` codebase, especially when the cases handled don't make sense.
🔇 Additional comments (1)
tests/test_immediate_submit_without_shared_fs/clustersubmit (1)
1-161: LGTM! Well-implemented test script.The cluster submission script effectively simulates a cluster-managed DAG execution environment. The implementation addresses all the key requirements:
- Proper database schema with constraints and appropriate column types
- Asynchronous job execution with proper error handling
- SQLite concurrency handling with locks and WAL mode
- Efficient file reading and dependency management
- Comprehensive logging for debugging
The script serves its purpose well as a test utility for validating immediate submission without shared filesystems.
johanneskoester
left a comment
There was a problem hiding this comment.
Thanks a lot, in particular also for the well designed test case!
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
src/snakemake/cli.py(1 hunks)src/snakemake/jobs.py(1 hunks)src/snakemake/scheduling/job_scheduler.py(4 hunks)tests/tests.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/snakemake/jobs.py
- src/snakemake/cli.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Files:
tests/tests.pysrc/snakemake/scheduling/job_scheduler.py
🧠 Learnings (2)
📚 Learning: 2024-10-06T14:09:54.370Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3117
File: tests/test_wrapper/Snakefile:11-11
Timestamp: 2024-10-06T14:09:54.370Z
Learning: Changes made within test cases, such as in `tests/test_wrapper/Snakefile`, are for testing purposes and do not require updates to the project documentation.
Applied to files:
tests/tests.py
📚 Learning: 2024-11-12T20:22:54.184Z
Learnt from: johanneskoester
PR: snakemake/snakemake#3204
File: tests/test_cores_cluster/qsub:1-6
Timestamp: 2024-11-12T20:22:54.184Z
Learning: In the Snakemake codebase, the `tests/test_cores_cluster/qsub` script is a dummy script for testing, and input validation and error handling are not required in such scripts.
Applied to files:
tests/tests.py
🧬 Code graph analysis (2)
tests/tests.py (1)
tests/common.py (2)
run(152-503)dpath(33-36)
src/snakemake/scheduling/job_scheduler.py (5)
src/snakemake/jobs.py (5)
dag(1387-1388)jobs(1362-1363)jobs(1366-1367)postprocess(1180-1274)postprocess(1600-1660)src/snakemake/workflow.py (8)
remote_exec(449-450)local_exec(434-438)dag(511-512)run(2314-2315)dryrun(410-414)touch(417-422)exec_mode(457-458)linemaps(503-504)src/snakemake/common/__init__.py (1)
async_run(91-108)src/snakemake/dag.py (5)
retrieve_storage_inputs(401-454)jobs(610-612)postprocess(1866-1910)handle_log(984-989)handle_touch(909-916)tests/common.py (1)
run(152-503)
🔇 Additional comments (4)
src/snakemake/scheduling/job_scheduler.py (4)
341-365: Immediate-submit: correct to skip local rules and warnSkipping
local_runjobsunder--immediate-submitmatches the intended semantics and avoids premature local execution; the warning is helpful.
369-369: No functional changeLooks fine.
401-403: Minor: local alias for immediate_submitGood micro-optimization/readability; avoids repeated property lookups.
410-416: Postprocess now ignores missing outputs under immediate-submitPassing
ignore_missing_outputfor DEFAULT/SUBPROCESS/REMOTE and in the error path aligns postprocessing with submit-only behavior. This prevents premature failures before the external executor produces outputs.Also applies to: 417-423, 426-434, 439-439
🤖 I have created a release *beep* *boop* --- ## [9.11.0](v9.10.1...v9.11.0) (2025-09-05) ### Features * add landing page to the report containing custom metadata defined with a YTE yaml template ([#3452](#3452)) ([c754d80](c754d80)) * Issue a warning when unsupported characters are used in wildcard names ([#1587](#1587)) ([fa57355](fa57355)) ### Bug Fixes * avoid checking output files in immediate-submit mode ([#3569](#3569)) ([58c42cf](58c42cf)) * clarify --keep-going flag help text to distinguish runtime vs DAG construction errors ([#3705](#3705)) ([a3a8ef4](a3a8ef4)) * enable docstring assignment in `use rule ... with:` block ([#3710](#3710)) ([2191962](2191962)) * fix missing attribute error in greedy scheduler settings when using touch, dryrun or immediate submit. ([6471004](6471004)) * only trigger script with CODE label ([#3707](#3707)) ([2d01f8c](2d01f8c)) * parser.py incorrectly parsing multiline f-strings ([#3701](#3701)) ([06ece76](06ece76)) * parsing ordinary yaml strings ([#3506](#3506)) ([ddf334e](ddf334e)) * remove temp files when using checkpoints ([#3716](#3716)) ([5ff3e20](5ff3e20)) * Restore python rules changes triggering reruns. (GH: [#3213](#3213)) ([#3485](#3485)) ([2f663be](2f663be)) * unit testing ([#3680](#3680)) ([06ba7e6](06ba7e6)) * use queue_input_wait_time when updating queue input jobs ([#3712](#3712)) ([a945a19](a945a19)) ### Documentation * output files within output directories is an error ([#2848](#2848)) ([#2913](#2913)) ([37272e5](37272e5)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
…3569) <!--Add a description of your PR here--> Disable checking output files when snakemake is run with flag `--immediate-submit`. Notes: * The flag `--notemp` should be additionally indicated (the CLI suggests for it) * The flag `--not-retrieve-storage` should be indicated to avoid snakemake to try retrieving remote data that may not yet exist * `localrule`s break the DAG because their dependency pattern can not be handled by cluster logics. A warning is printed and the scheduler stops processing jobs requiring their output. ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Summary by CodeRabbit - **New Features** - Added support for immediate job submission mode that skips local job execution and adjusts output handling accordingly. - Introduced a new test workflow and cluster submission script to validate execution without a shared filesystem. - **Bug Fixes** - Improved handling of missing outputs during job postprocessing to avoid waiting for intentionally ignored files. - **Documentation** - Expanded help text for the immediate submission option with additional usage guidance. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Lucio Anderlini <Lucio.Anderlini@fi.infn.it> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Johannes Koester <johannes.koester@uni-due.de>
🤖 I have created a release *beep* *boop* --- ## [9.11.0](snakemake/snakemake@v9.10.1...v9.11.0) (2025-09-05) ### Features * add landing page to the report containing custom metadata defined with a YTE yaml template ([snakemake#3452](snakemake#3452)) ([c754d80](snakemake@c754d80)) * Issue a warning when unsupported characters are used in wildcard names ([snakemake#1587](snakemake#1587)) ([fa57355](snakemake@fa57355)) ### Bug Fixes * avoid checking output files in immediate-submit mode ([snakemake#3569](snakemake#3569)) ([58c42cf](snakemake@58c42cf)) * clarify --keep-going flag help text to distinguish runtime vs DAG construction errors ([snakemake#3705](snakemake#3705)) ([a3a8ef4](snakemake@a3a8ef4)) * enable docstring assignment in `use rule ... with:` block ([snakemake#3710](snakemake#3710)) ([2191962](snakemake@2191962)) * fix missing attribute error in greedy scheduler settings when using touch, dryrun or immediate submit. ([6471004](snakemake@6471004)) * only trigger script with CODE label ([snakemake#3707](snakemake#3707)) ([2d01f8c](snakemake@2d01f8c)) * parser.py incorrectly parsing multiline f-strings ([snakemake#3701](snakemake#3701)) ([06ece76](snakemake@06ece76)) * parsing ordinary yaml strings ([snakemake#3506](snakemake#3506)) ([ddf334e](snakemake@ddf334e)) * remove temp files when using checkpoints ([snakemake#3716](snakemake#3716)) ([5ff3e20](snakemake@5ff3e20)) * Restore python rules changes triggering reruns. (GH: [snakemake#3213](snakemake#3213)) ([snakemake#3485](snakemake#3485)) ([2f663be](snakemake@2f663be)) * unit testing ([snakemake#3680](snakemake#3680)) ([06ba7e6](snakemake@06ba7e6)) * use queue_input_wait_time when updating queue input jobs ([snakemake#3712](snakemake#3712)) ([a945a19](snakemake@a945a19)) ### Documentation * output files within output directories is an error ([snakemake#2848](snakemake#2848)) ([snakemake#2913](snakemake#2913)) ([37272e5](snakemake@37272e5)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Disable checking output files when snakemake is run with flag
--immediate-submit.Notes:
--notempshould be additionally indicated (the CLI suggests for it)--not-retrieve-storageshould be indicated to avoid snakemake to try retrieving remote data that may not yet existlocalrules break the DAG because their dependency pattern can not be handled by cluster logics. A warning is printed and the scheduler stops processing jobs requiring their output.QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests