Conversation
📝 WalkthroughWalkthroughThe changes in this pull request involve updates to the command-line interface and job selection logic within the Snakemake framework. The Changes
Possibly related PRs
Suggested reviewers
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (1)
snakemake/cli.py (1)
821-826: Clarify the help text for the--schedulerargumentThe phrase "increasing sub-optimal solutions" in the help text may be misleading. Consider rephrasing it to "producing less optimal scheduling solutions" or "resulting in less optimal scheduling" to better convey that these algorithms may produce schedules that are not as optimal but are generated more quickly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- snakemake/cli.py (1 hunks)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
snakemake/cli.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🪛 Ruff
snakemake/scheduler.py
801-801: Undefined name
res(F821)
|
Thanks @fgvieira for preparing the PR! To give some context: We are having a use case with many (millions) of rather small jobs, where each job execution is ~1sec, but the scheduler takes a full minute to select the next batch of jobs to be run, and is hence the main limiting factor for throughput. The algorithms proposed here solve this in two distinct ways:
|
There was a problem hiding this comment.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (1)
snakemake/scheduler.py (1)
698-701: Remove unused function and approve changes.The changes to the
job_selector_greedymethod are minor and don't alter the core logic. However, there's an unused function that can be removed.Remove the unused
calc_rewardfunction:- def calc_reward(): - return [c_j * y_j for c_j, y_j in zip(c, y)]This will clean up the code and remove unnecessary elements.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (1)
snakemake/scheduler.py (1)
130-136: LGTM: Job selector assignment logic is correct and clear.The implementation correctly assigns the appropriate job selector method based on the scheduler setting. This allows for flexible scheduling strategies.
|
@johanneskoester and Snakemake team, one thing that is still a bit unclear to me is the computation of the rewards. As far as I could make sense of the existing implementation here, there are three values |
|
Thanks a lot, very interesting! I will try to assess this as soon as possible. This comes timely as I anyway had the plan to finalize the plugin APIs of Snakemake soon, including scheduler plugins and deployment plugins. Maybe I manage to do that next week. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
snakemake/scheduler.py (2)
811-846: LGTM: Simple and effective implementation of the naive job selector.The
job_selector_naivemethod is well-implemented, providing a straightforward approach to job selection. It correctly checks resource constraints before adding jobs to the solution.Consider optimizing the resource checking loop.
The current implementation uses a separate loop for checking resources and another for updating them. This can be optimized for better performance.
Consider applying this optimization:
- # Check resources - exhausted_some_res = False - for i in range(len(max_glob_res)): - # Check if exceeds max - if used_res[i] + job_res[i] > max_glob_res[i]: - exhausted_some_res = True - break - - # Check if limits exceeded - if exhausted_some_res: - break - else: - # Update total resources - for i in range(len(max_glob_res)): - used_res[i] += job_res[i] - # Add job - solution.add(job) + # Check resources and update if possible + if all(used_res[i] + job_res[i] <= max_glob_res[i] for i in range(len(max_glob_res))): + # Update total resources + used_res = [used + job for used, job in zip(used_res, job_res)] + # Add job + solution.add(job) + else: + breakThis change combines the resource checking and updating into a single loop, potentially improving performance for large numbers of jobs.
Line range hint
698-742: LGTM: Improved efficiency with asynchronous reward calculation.The use of
async_run(self.jobs_rewards(jobs))is a good improvement, potentially enhancing the efficiency of reward calculation for multiple jobs.Remove unused
calc_rewardfunction.The
calc_rewardfunction is defined but never used in this method. It's best to remove unused code to maintain cleanliness and prevent confusion.Apply this diff to remove the unused function:
- def calc_reward(): - return [c_j * y_j for c_j, y_j in zip(c, y)]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
📓 Learnings (1)
snakemake/scheduler.py (1)
Learnt from: lczech PR: snakemake/snakemake#3113 File: snakemake/scheduler.py:912-914 Timestamp: 2024-10-04T16:12:18.927Z Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
🪛 Ruff
snakemake/scheduler.py
797-797: Undefined name
res(F821)
797-797: Undefined name
res(F821)
🔇 Additional comments (2)
snakemake/scheduler.py (2)
130-137: LGTM: Job selector assignment logic looks good.The new job selector assignment logic is clear, concise, and covers all possible scheduler options. It's a good addition that allows for flexible scheduling strategies.
913-915: LGTM: Efficient asynchronous implementation of jobs_rewards.The
jobs_rewardsmethod is well-implemented, providing an efficient way to calculate rewards for multiple jobs asynchronously. The use of a list comprehension withawaitis a good choice, allowing for concurrent reward calculations without the overhead ofasyncio.gather, which aligns with the previous feedback about immediate result needs and potential overhead concerns.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
snakemake/cli.py (1)
821-826: Correct grammatical errors and improve clarity in help text for--schedulerargumentThe help text contains grammatical errors and typos:
- The phrase "increasing sub-optimal solutions" should be rephrased to "obtaining sub-optimal solutions" for clarity.
- "100s of thousand of similar jobs" should be corrected to "hundreds of thousands of similar jobs".
Apply this diff to improve the help text:
"The greedy, greedier and naive algorithms aim at reducing the scheduler runtime at the expense of" -increasing sub-optimal solutions (but useful if dealing with 100s of thousand of similar jobs)." +obtaining sub-optimal solutions (but useful when dealing with hundreds of thousands of similar jobs)."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/cli.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/cli.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
snakemake/scheduler.py (2)
698-701: Remove unusedcalc_rewardfunction.The
calc_rewardfunction defined insidejob_selector_greedyis not used within the method. Consider removing this unused function to improve code clarity.
910-912: LGTM: Efficient asynchronous implementation ofjobs_rewards.The
jobs_rewardsmethod efficiently calculates rewards for multiple jobs asynchronously. This implementation is suitable for most scenarios.However, for very large numbers of jobs, consider using
asyncio.gatherto potentially improve performance:async def jobs_rewards(self, jobs): return await asyncio.gather(*(self.job_reward(job) for job in jobs))This change could allow for more efficient parallel execution of reward calculations.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
📓 Learnings (1)
snakemake/scheduler.py (1)
Learnt from: lczech PR: snakemake/snakemake#3113 File: snakemake/scheduler.py:912-914 Timestamp: 2024-10-04T16:12:18.927Z Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
🔇 Additional comments (4)
snakemake/scheduler.py (4)
130-137: LGTM: Flexible job selector assignment implemented.The implementation of different job selection strategies based on the
schedulersetting is well-structured and provides good flexibility. The addition of "naive" and "greedier" schedulers expands the options for job selection, which can be beneficial for different workload scenarios.
743-809: LGTM: Efficient implementation of greedier job selector.The
job_selector_greediermethod is well-implemented, using a heap for efficient job selection based on rewards. The greediness calculation allows for flexibility in job selection, and resource constraints are properly checked before adding jobs to the solution.
811-843: LGTM: Simple and correct implementation of naive job selector.The
job_selector_naivemethod provides a straightforward implementation for job selection. It iterates through jobs, adding them to the solution until resources are exhausted. This simple approach can be useful in scenarios where a more complex selection strategy is not necessary.
910-912: LGTM: Type annotations added tojob_rewardmethod.The addition of type annotations to the
job_rewardmethod improves code readability and maintainability. This change aligns with best practices for Python development.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
snakemake/scheduler.py (3)
698-701: Consider removing unused functioncalc_reward.The
calc_rewardfunction is defined but not used in the provided code segment. It might be leftover code from a previous implementation. Consider removing it to improve code clarity.
743-811: LGTM: Efficient implementation of the greedier job selector.The
job_selector_greediermethod is well-implemented, using a heap for efficient job selection based on rewards. The greediness calculation allows for flexibility in job selection, and resource constraints are properly checked before adding jobs to the solution.Consider adding comments to explain the key steps of the algorithm, such as the greediness calculation and the heap usage, to improve code readability and maintainability.
812-845: LGTM: Simple and correct implementation of the naive job selector.The
job_selector_naivemethod is implemented correctly, iterating through jobs and adding them to the solution until resources are exhausted. This approach is appropriately simple for a naive selector, not considering job rewards or priorities.Consider adding a brief comment explaining that this method doesn't take job priorities into account, which might be useful for users or maintainers of the code.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
📓 Learnings (1)
snakemake/scheduler.py (1)
Learnt from: lczech PR: snakemake/snakemake#3113 File: snakemake/scheduler.py:912-914 Timestamp: 2024-10-04T16:12:18.927Z Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
🔇 Additional comments (3)
snakemake/scheduler.py (3)
130-137: LGTM: Job selector assignment looks good.The implementation correctly assigns different job selector methods based on the scheduler setting. The structure is clear and ensures that only one selector is chosen.
740-742: LGTM: Proper update of available resources.The
update_available_resourcesmethod is correctly called with the solution set at the end of thejob_selector_greedymethod. This ensures that the available resources are accurately tracked after job selection.
912-914: LGTM: Efficient asynchronous implementation of jobs_rewards.The
jobs_rewardsmethod is implemented correctly as an asynchronous method. It efficiently uses a list comprehension to calculate rewards for all jobs. This implementation is concise and makes good use of Python's asynchronous features.
|
I'm sure everyone has a different use-case/preference, but in the context of our HPC, I would much rather just have all the jobs submitted quickly rather than submitted in an optimal order but slowly. The |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
snakemake/scheduler.py (2)
698-701: Remove unused functioncalc_reward.The
calc_rewardfunction is defined but not used in the provided code segment. It appears to be a leftover from previous code. Consider removing it to improve code clarity and maintainability.Apply this diff to remove the unused function:
- def calc_reward(): - return [c_j * y_j for c_j, y_j in zip(c, y)]
743-774: LGTM: Efficient implementation of greedy job selection.The
_job_selector_greediermethod is well-implemented, using an efficient approach to select jobs until resources are exhausted. The use ofany()for resource checking is a good Pythonic approach.A minor suggestion for improved readability:
Consider renaming
global_restomax_global_resorglobal_res_limitsto more clearly indicate that these are the maximum resource capacities.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
📓 Learnings (1)
snakemake/scheduler.py (1)
Learnt from: lczech PR: snakemake/snakemake#3113 File: snakemake/scheduler.py:912-914 Timestamp: 2024-10-04T16:12:18.927Z Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
🔇 Additional comments (5)
snakemake/scheduler.py (5)
130-137: LGTM: Clear and correct job selector assignment.The implementation of job selector assignment based on the scheduler setting is well-structured and correct. It provides a clear way to choose between different job selection strategies.
740-742: LGTM: Correct update of available resources.The call to
update_available_resourcesis correctly placed after job selection in thejob_selector_greedymethod. This ensures that resource availability is accurately maintained for subsequent scheduling rounds.
819-841: LGTM: Simple and efficient naive job selection.The
job_selector_naivemethod is well-implemented, providing a straightforward approach to job selection. The use ofislicefor selecting a subset of jobs is efficient, and the reuse of_job_selector_greedierfor final selection promotes code reuse.
908-910: LGTM: Efficient asynchronous calculation of job rewards.The
jobs_rewardsmethod is well-implemented, using asynchronous programming effectively to calculate rewards for multiple jobs. The use of a list comprehension with await makes the code concise and readable.
Line range hint
130-910: Overall assessment: Well-implemented job scheduling enhancements.The changes to the
JobSchedulerclass introduce new job selection strategies (naive and greedier) while improving the existing ones. The implementations are generally correct, efficient, and make good use of Python's features and asynchronous programming. The code is well-structured and promotes reusability.Minor suggestions for improvement include:
- Removing an unused function in the greedy selector.
- Considering a minor optimization in the greedier selector using
heapq.nlargest().- Improving variable naming for clarity in some parts.
These changes significantly enhance the flexibility and efficiency of job scheduling in Snakemake.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
snakemake/scheduler.py (1)
775-817: LGTM: Efficient implementation of the greedier job selector.The
job_selector_greediermethod is well-implemented, using a heap to efficiently select the most rewarding jobs. The calculation ofnprovides a good balance between considering all jobs and limiting to available resources.A potential optimization:
Consider using
heapq.nlargest()instead of manually popping elements from the heap. This could simplify the code and potentially be more efficient:- max_sorted = [heapq.heappop(heap)[1] for i in range(n)] + max_sorted = [job for _, job in heapq.nlargest(len(heap), heap)]This change eliminates the need for manual heap operations while achieving the same result.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/scheduler.py (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/scheduler.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
📓 Learnings (1)
snakemake/scheduler.py (1)
Learnt from: lczech PR: snakemake/snakemake#3113 File: snakemake/scheduler.py:912-914 Timestamp: 2024-10-04T16:12:18.927Z Learning: In `snakemake/scheduler.py`, avoid suggesting the use of `asyncio.gather` in the `jobs_rewards` method due to overhead concerns and the need for immediate results.
🔇 Additional comments (6)
snakemake/scheduler.py (6)
130-137: LGTM: Job selector assignment logic is correct and clear.The implementation correctly assigns the appropriate job selector based on the
schedulersetting. This allows for easy extension of scheduling algorithms in the future.
743-773: LGTM: Correct implementation of the greedy job selection algorithm.The
_job_selector_greediermethod efficiently selects jobs while respecting resource constraints. The implementation is clear and straightforward.
819-841: LGTM: Simple and effective implementation of the naive job selector.The
job_selector_naivemethod provides a straightforward approach to job selection. The use ofisliceto limit the number of jobs considered is an efficient way to implement the greediness parameter.
Line range hint
843-850: LGTM: Correct implementation of resource update.The
update_available_resourcesmethod accurately updates the available resources after job selection. The special handling of_job_countis appropriate.
Line range hint
878-907: LGTM: Improved job reward calculation.The
job_rewardmethod now properly handles different workflow modes (touch, dryrun, immediate submit) and calculates rewards based on relevant factors (priority, temp file size, input size). This should lead to more effective job scheduling.
908-910: LGTM: Simple implementation of jobs_rewards method.The
jobs_rewardsmethod correctly calculates rewards for multiple jobs. The implementation is clear and straightforward.
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (2)
snakemake/cli.py (2)
817-822: Expanded scheduler options and improved help text.The changes to the
--schedulerargument are well-implemented and provide more flexibility to users. Here are the key improvements:
- Two new scheduler options have been added: "greedier" and "naive".
- The default scheduler is now set based on the availability of the recommended LP solver (COIN_CMD).
- The help text has been updated to explain the differences between the scheduler options.
These changes enhance the functionality and user-friendliness of the Snakemake CLI.
Consider adding brief descriptions of the new "greedier" and "naive" schedulers in the help text to give users a better understanding of when to use each option. For example:
help=( "Specifies if jobs are selected by solving an ILP, a greedy, a greedier, or a naive algorithm. " "The ILP scheduler aims to reduce runtime and HDD usage by best possible use of resources. " "The greedy scheduler balances optimization and speed. " "The greedier scheduler prioritizes speed over optimal resource usage. " "The naive scheduler is the fastest but may lead to suboptimal resource utilization. " "Choose based on your workflow complexity and performance requirements." ),
Line range hint
823-844: New WMS monitoring arguments added.Two new arguments have been added to support workflow management system (WMS) monitoring:
--wms-monitor: Allows specifying an IP and port for a WMS to monitor Snakemake execution.--wms-monitor-arg: Enables passing additional key-value pair arguments to the WMS endpoint.These additions enhance Snakemake's integration capabilities with external monitoring systems. The implementation is clean and well-documented.
Consider adding a brief example in the help text for
--wms-monitor-argto illustrate its usage. For instance:help=( "If the workflow management service accepts extra arguments, provide " "them in key value pairs with --wms-monitor-arg. For example: " "--wms-monitor-arg id=12345 status=running" ),This would make it clearer to users how to format multiple arguments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- snakemake/cli.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/cli.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (1)
snakemake/cli.py (1)
Line range hint
1-844: Overall assessment: Well-implemented enhancements to Snakemake CLI.The changes made to
snakemake/cli.pysignificantly improve the functionality and flexibility of the Snakemake command-line interface:
- Expanded scheduler options with the addition of "greedier" and "naive" algorithms.
- Improved scheduler selection logic based on available LP solvers.
- Added WMS monitoring capabilities with
--wms-monitorand--wms-monitor-arg.These enhancements provide users with more control over job scheduling and better integration with external monitoring systems. The implementation is clean, well-documented, and maintains consistency with the existing codebase.
While no major issues were found, consider implementing the suggested minor improvements to the help text for even better user experience.
|
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake/cli.py (1)
879-883: Clear explanation of scheduler optionsThe help text has been updated to explain the differences between the scheduler options, highlighting that the ILP scheduler optimizes resource usage while the greedy, greedier, and naive algorithms prioritize reducing scheduler runtime, potentially at the cost of solution optimality.
Consider enhancing the help text to include specific performance characteristics mentioned in the PR - such as the 'greedier' scheduler being 250x faster than 'greedy', and that 'naive' operates almost instantaneously. This would help users make an informed choice based on their specific use case.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
snakemake/cli.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
snakemake/cli.py
⏰ Context from checks skipped due to timeout of 90000ms (40)
- GitHub Check: testing (10, 3.12, dash)
- GitHub Check: testing (10, 3.12, bash)
- GitHub Check: testing (10, 3.11, bash)
- GitHub Check: testing (9, 3.12, dash)
- GitHub Check: testing (9, 3.12, bash)
- GitHub Check: testing (9, 3.11, bash)
- GitHub Check: testing (8, 3.12, dash)
- GitHub Check: testing (8, 3.12, bash)
- GitHub Check: testing (8, 3.11, bash)
- GitHub Check: testing (7, 3.12, dash)
- GitHub Check: testing (7, 3.12, bash)
- GitHub Check: testing (7, 3.11, bash)
- GitHub Check: testing (6, 3.12, dash)
- GitHub Check: testing (6, 3.12, bash)
- GitHub Check: testing (6, 3.11, bash)
- GitHub Check: testing (5, 3.12, dash)
- GitHub Check: testing (5, 3.12, bash)
- GitHub Check: testing (5, 3.11, bash)
- GitHub Check: testing (4, 3.12, dash)
- GitHub Check: testing (4, 3.12, bash)
- GitHub Check: testing (4, 3.11, bash)
- GitHub Check: testing (3, 3.12, dash)
- GitHub Check: testing (3, 3.12, bash)
- GitHub Check: testing-windows (10)
- GitHub Check: testing (3, 3.11, bash)
- GitHub Check: testing-windows (9)
- GitHub Check: testing (2, 3.12, dash)
- GitHub Check: testing-windows (8)
- GitHub Check: testing (2, 3.12, bash)
- GitHub Check: testing-windows (7)
- GitHub Check: testing (2, 3.11, bash)
- GitHub Check: testing-windows (6)
- GitHub Check: testing (1, 3.12, dash)
- GitHub Check: testing-windows (5)
- GitHub Check: testing (1, 3.12, bash)
- GitHub Check: testing-windows (4)
- GitHub Check: testing (1, 3.11, bash)
- GitHub Check: testing-windows (3)
- GitHub Check: testing-windows (2)
- GitHub Check: testing-windows (1)
🔇 Additional comments (3)
snakemake/cli.py (3)
877-877: Updates to scheduler choices to include new optionsThe scheduler choices have been expanded to include the new 'greedier' and 'naive' options alongside the existing 'ilp' and 'greedy' options. This aligns with the PR objective to introduce more efficient schedulers for large job batches.
886-893: New monitoring functionality with --wms-monitorAdded a new command-line argument to enable workflow monitoring by allowing specification of an IP and port for a workflow management system. The help text properly explains the requirement for an authorization token.
894-905: Support for additional WMS argumentsAdded a new command-line argument to provide extra key-value pairs to the workflow management service. The help text gives a clear example of how to use this option.
|
@fgvieira the scheduler plugin interface is released and can be used with snakemake 9.10: https://github.com/snakemake/snakemake-interface-scheduler-plugins |
|
I've been a bit busy, but will try to turn these schedulers as plugins. |
|
@johanneskoester managed to make a first version of the scheduler plugin at fgvieira/snakemake-scheduler-plugin-firstfit. Let me know if you have any suggestions. |
|
Closing as this has been re-implemented as a plugin snakemake-scheduler-plugin-firstfit |



Similarly to #3112, the scheduler can be quite slow. In this PR, me and @lczech have implemented two new schedulers, a
greedierand anaive. For a simulated case of 500k ready jobs:The
greedieris over 250x faster, while thenaiveis practically instantaneous:As for the results, the
greediergives results similar to the currentgreedyimplementation, while thenaiveis close but sub-optimal:Simulation code
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
greedierandnaive.--wms-monitorargument for specifying workflow management system details.--wms-monitor-argfor providing additional parameters to the workflow management service.Enhancements
Bug Fixes