feat: switch from toposort to graphlib#3109
Conversation
|
Warning Rate limit exceeded@fgvieira has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 57 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughThe changes in this pull request involve updates to the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
snakemake/dag.py (1)
2750-2757: Avoid shadowing built-in functionsortedby renaming the variableThe variable
sortedin yourgraphlib_toposortfunction shadows Python's built-insorted()function, which can lead to confusion and potential errors. It's best practice to avoid using names of built-in functions for variable names.Consider renaming the variable
sortedto something likesorted_layersortopo_order.Apply this diff:
- sorted = list() + topo_order = list() ... - sorted.append(ready) + topo_order.append(ready) ... - return sorted + return topo_order
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- setup.cfg (0 hunks)
- snakemake/dag.py (2 hunks)
- test-environment.yml (0 hunks)
💤 Files with no reviewable changes (2)
- setup.cfg
- test-environment.yml
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/dag.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
Can you make the graphlib_toposort outside of the function(I'm not too fond of inlining) and add a small test to ensure the right format is given? |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
tests/tests.py (1)
2270-2271: LGTM! Consider adding a brief docstring.The new test function
test_toposort()is well-structured and consistent with other tests in the file. It correctly uses therun()function with appropriate parameters for testing the topological sort functionality.Consider adding a brief docstring to explain the purpose of this test, for example:
def test_toposort(): """Test the topological sort functionality in dryrun mode.""" run(dpath("test_toposort"), check_results=False, executor="dryrun")snakemake/dag.py (2)
80-93: LGTM! Consider adding a type hint for clarity.The
toposortfunction is well-implemented and efficiently usesgraphlib.TopologicalSorter. It correctly groups tasks that can be executed in parallel.Consider adding a type hint for the
graphparameter to improve clarity:def toposort(graph: Dict[Any, Set[Any]]):
Line range hint
2793-2818: LGTM! Consider extracting the pipe job sorting logic for improved readability.The updated
toposortedmethod correctly uses the newtoposortfunction and handles pipe dependencies well. The implementation for sorting pipe jobs within each layer is correct but complex.To improve readability, consider extracting the pipe job sorting logic into a separate method:
def sort_pipe_jobs(group): return chain.from_iterable( self.toposort( { job: {dep for dep in self._dependencies[job] if dep in group} for job in group } ) ) # Then in the main method: sorted_layer.extend(sort_pipe_jobs(group))This refactoring would make the main method easier to read and understand.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
- snakemake/dag.py (3 hunks)
- tests/test_toposort/Snakefile (1 hunks)
- tests/tests.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
snakemake/dag.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.tests/tests.py (1)
Pattern
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
🔇 Additional comments (2)
tests/test_toposort/Snakefile (1)
1-3: LGTM: Import and graph definition look good.The import statement and graph definition are clear and concise. The graph represents a simple but effective test case for the
toposortfunction.snakemake/dag.py (1)
Line range hint
1-2818: Overall, the changes look good and improve the DAG functionality.The implementation of the new
toposortfunction and the updates to thetoposortedmethod are well-done. These changes enhance the topological sorting functionality and correctly handle pipe dependencies and parallel execution. The code is generally clean and efficient, with only minor suggestions for improvement in terms of type hinting and code organization.
|
🤖 I have created a release *beep* *boop* --- ## [8.22.0](v8.21.0...v8.22.0) (2024-10-13) ### Features * switch from toposort to graphlib ([#3109](#3109)) ([91e875d](91e875d)) ### Bug Fixes * configfile `group` and `group-components` were not being registered ([#3135](#3135)) ([4397c7d](4397c7d)) * remove paramiko dependency as issue has been fixed ([#3110](#3110)) ([1b43250](1b43250)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>



Use
graphlibinstead oftoposortfor topological sorting (following #2134).QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
Bug Fixes
toposortdependency to streamline package requirements.Chores