feat: shell function calls inside of the 'run' directive now use conda, container, or envmodules specifications#2289
Conversation
|
Just to clarify: the example above is simplified, and I realize it can also be accomplished by defining a separate params function. However, I frequently have rules that have some python control flow (if/else, for) mixed with shell commands, for which this is not the case. |
|
Kudos, SonarCloud Quality Gate passed! |
|
📝 WalkthroughWalkthroughThis pull request updates documentation and internal workflow validations in Snakemake to permit the use of conda environments and apptainer containers with the Changes
Sequence Diagram(s)sequenceDiagram
participant U as User/CLI
participant W as Workflow Engine
participant R as Rule (run directive)
participant S as Python Script
U->>W: Trigger workflow execution
W->>W: Validate rule (check template_engine for conda/container usage)
W->>R: Execute rule with run directive
R->>S: Invoke conda run script (e.g., test_script_run.py)
S->>R: Write output (e.g., to test.txt)
R->>W: Return execution result
Possibly related PRs
Suggested reviewers
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (2)
tests/test_conda_run/Snakefile (1)
1-9: LGTM! Well-structured test case for conda integration.The rule effectively validates the new conda directive functionality for shell commands within run blocks. The test design is clean and purposeful, using numpy version verification as a practical test case.
Consider these enhancements to make the test more robust:
rule random_python_conda_script: + # Test case for conda directive integration with shell commands in run blocks output: "version.txt" conda: "test_python_env.yaml" params: script=srcdir('test_script.py') + log: + "logs/version.log" run: - shell("python {params.script}") + shell("python {params.script} 2> {log}") + # Verify the output file exists and is not empty + shell("test -s {output}")snakemake/workflow.py (1)
1807-1807: Improve boolean comparison styleThe comparison
ruleinfo.container_img != Falsecan be simplified to justruleinfo.container_imgfor a more Pythonic boolean check.- if not ruleinfo.template_engine and ruleinfo.container_img != False: + if not ruleinfo.template_engine and ruleinfo.container_img:🧰 Tools
🪛 Ruff
1807-1807: Avoid inequality comparisons to
False; useif ruleinfo.container_img:for truth checksReplace with
ruleinfo.container_img(E712)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
docs/snakefiles/deployment.rst(2 hunks)snakemake/workflow.py(2 hunks)tests/test_conda_run/Snakefile(1 hunks)tests/test_conda_run/expected-results/version.txt(1 hunks)tests/test_conda_run/test_python_env.yaml(1 hunks)tests/test_conda_run/test_script.py(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- tests/test_conda_run/expected-results/version.txt
- tests/test_conda_run/test_script.py
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/workflow.py (1)
Pattern **/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self argument of methods.
Do not suggest type annotation of the cls argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return statement.
🪛 Ruff
snakemake/workflow.py
1807-1807: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks
Replace with ruleinfo.container_img
(E712)
🔇 Additional comments (5)
tests/test_conda_run/test_python_env.yaml (1)
1-3: LGTM! Channel configuration follows best practices.
The channel configuration is well-structured with conda-forge as the primary channel and defaults as the fallback, which is the recommended approach for stability and package availability.
tests/test_conda_run/Snakefile (1)
8-9: Verify the test script's existence and conda environment configuration.
Let's ensure the test dependencies are properly set up.
✅ Verification successful
Let me verify the content of the test script to ensure it aligns with the Snakefile's execution.
All test dependencies are properly configured and aligned ✓
The verification confirms that:
- The test script
test_script.pyexists and contains valid Python code that writes numpy version toversion.txt - The conda environment file
test_python_env.yamlexists with proper configuration:- Specifies numpy 1.21.4 and Python <3.10
- Uses conda-forge and defaults channels
- The Snakefile correctly references both the conda environment and the test script
- The script path in the Snakefile uses
srcdir()for proper path resolution
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test dependencies and configuration
# Check if test script exists
echo "Checking test script..."
test -f "tests/test_conda_run/test_script.py" && echo "Found test_script.py" || echo "Missing test_script.py"
# Check conda environment file
echo -e "\nChecking conda environment file..."
test -f "tests/test_conda_run/test_python_env.yaml" && echo "Found test_python_env.yaml" || echo "Missing test_python_env.yaml"
# Display conda environment contents if file exists
if [ -f "tests/test_conda_run/test_python_env.yaml" ]; then
echo -e "\nConda environment contents:"
cat "tests/test_conda_run/test_python_env.yaml"
fi
Length of output: 616
Script:
#!/bin/bash
# Check the content of test_script.py
echo "Content of test_script.py:"
cat tests/test_conda_run/test_script.py
# Check the complete Snakefile for context
echo -e "\nContent of Snakefile:"
cat tests/test_conda_run/Snakefile
Length of output: 492
docs/snakefiles/deployment.rst (1)
288-293: LGTM! Clear documentation of conda directive behavior.
The added documentation clearly explains how the conda directive interacts with the run directive, which is crucial for users to understand the scope of environment isolation.
snakemake/workflow.py (2)
1783-1787: Validation logic for conda environments has been updated
The validation now specifically mentions that conda environments are not allowed with template_engine directives, while being allowed with run, shell, script, notebook, or wrapper directives. This change aligns with the PR's objective of enhancing conda environment integration.
1797-1801: Validation logic for container images has been updated
Similar to the conda environment validation, the container image validation now specifically mentions that Singularity directives are not allowed with template_engine directives, while being allowed with run, shell, script, notebook, or wrapper directives.
|
That would be very intuitive and more convenient. Hope this PR get reviewed and approved soon. |
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
docs/snakefiles/deployment.rst (1)
288-293: LGTM! Consider minor rewording for clarity.The documentation clearly explains the behavior of conda environments with the
rundirective. For improved clarity, consider rewording "from the within run script" to "from within the run script".- The ``conda`` directive for rules with a ``run`` directive therefore only affects ``shell`` function calls that are executed from the within ``run`` script. + The ``conda`` directive for rules with a ``run`` directive therefore only affects ``shell`` function calls that are executed from within the ``run`` script.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
docs/snakefiles/deployment.rst(2 hunks)
🔇 Additional comments (1)
docs/snakefiles/deployment.rst (1)
460-463: LGTM! Documentation maintains consistency.
The explanation of container behavior with the run directive mirrors the conda section, maintaining consistency throughout the documentation.
…e of srcdir. Change file names to prevent test collection conflict.
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake/workflow.py (1)
1826-1826: Code style improvement opportunityThe condition for applying a global container image has been updated to work with the
rundirective, but there's a minor code style issue.Following Python style best practices, replace the inequality comparison to
Falsewith a truth check:- if not ruleinfo.template_engine and ruleinfo.container_img != False: + if not ruleinfo.template_engine and ruleinfo.container_img is not False:This is more idiomatic Python and addresses the static analysis warning.
🧰 Tools
🪛 Ruff (0.8.2)
1826-1826: Avoid inequality comparisons to
False; useif ruleinfo.container_img:for truth checksReplace with
ruleinfo.container_img(E712)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docs/snakefiles/deployment.rst(2 hunks)snakemake/workflow.py(2 hunks)tests/tests_using_conda.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/snakefiles/deployment.rst
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
tests/tests_using_conda.pysnakemake/workflow.py
🪛 Ruff (0.8.2)
snakemake/workflow.py
1826-1826: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks
Replace with ruleinfo.container_img
(E712)
⏰ Context from checks skipped due to timeout of 90000ms (40)
- GitHub Check: testing (10, 3.12, dash)
- GitHub Check: testing (10, 3.12, bash)
- GitHub Check: testing (10, 3.11, bash)
- GitHub Check: testing (9, 3.12, dash)
- GitHub Check: testing (9, 3.12, bash)
- GitHub Check: testing (9, 3.11, bash)
- GitHub Check: testing (8, 3.12, dash)
- GitHub Check: testing (8, 3.12, bash)
- GitHub Check: testing (8, 3.11, bash)
- GitHub Check: testing (7, 3.12, dash)
- GitHub Check: testing (7, 3.12, bash)
- GitHub Check: testing (7, 3.11, bash)
- GitHub Check: testing (6, 3.12, dash)
- GitHub Check: testing (6, 3.12, bash)
- GitHub Check: testing (6, 3.11, bash)
- GitHub Check: testing (5, 3.12, dash)
- GitHub Check: testing (5, 3.12, bash)
- GitHub Check: testing (5, 3.11, bash)
- GitHub Check: testing (4, 3.12, dash)
- GitHub Check: testing (4, 3.12, bash)
- GitHub Check: testing (4, 3.11, bash)
- GitHub Check: testing (3, 3.12, dash)
- GitHub Check: testing (3, 3.12, bash)
- GitHub Check: testing-windows (10)
- GitHub Check: testing (3, 3.11, bash)
- GitHub Check: testing-windows (9)
- GitHub Check: testing-windows (8)
- GitHub Check: testing (2, 3.12, dash)
- GitHub Check: testing-windows (7)
- GitHub Check: testing (2, 3.12, bash)
- GitHub Check: testing-windows (6)
- GitHub Check: testing-windows (5)
- GitHub Check: testing (2, 3.11, bash)
- GitHub Check: testing-windows (4)
- GitHub Check: testing (1, 3.12, dash)
- GitHub Check: testing-windows (3)
- GitHub Check: testing (1, 3.12, bash)
- GitHub Check: testing-windows (2)
- GitHub Check: testing (1, 3.11, bash)
- GitHub Check: testing-windows (1)
🔇 Additional comments (3)
tests/tests_using_conda.py (1)
309-311: Test added for conda functionality with run directiveThe new test case correctly follows the pattern of existing tests and will verify that conda environments can be used with shell function calls from 'run' scripts, aligning with the PR objectives.
snakemake/workflow.py (2)
1802-1807: Error message updated for conda environment validationThe condition for validating conda environments has been improved to check if a rule is a template engine instead of checking for specific directives. This allows conda environments to be used with the
rundirective, which was the main objective of this PR.
1816-1821: Error message updated for container image validationSimilar to the conda environment validation, the condition for validating container images (Singularity) has been improved to check if a rule is a template engine. This maintains consistency with the conda environment changes and also allows container images to be used with the
rundirective.
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake/workflow.py (1)
1812-1813: Use explicit boolean comparison for clarity
Comparing toFalsewith!= Falsecan be misleading. Consider explicitly checking with “is not False” or verifying truthiness:- if not ruleinfo.template_engine and ruleinfo.container_img != False: + if not ruleinfo.template_engine and ruleinfo.container_img is not False:🧰 Tools
🪛 Ruff (0.8.2)
1812-1812: Avoid inequality comparisons to
False; useif ruleinfo.container_img:for truth checksReplace with
ruleinfo.container_img(E712)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
snakemake/workflow.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting. Do not suggest ...
**/*.py: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of theselfargument of methods.
Do not suggest type annotation of theclsargument of classmethods.
Do not suggest return type annotation if a function or method does not contain areturnstatement.
snakemake/workflow.py
🪛 Ruff (0.8.2)
snakemake/workflow.py
1812-1812: Avoid inequality comparisons to False; use if ruleinfo.container_img: for truth checks
Replace with ruleinfo.container_img
(E712)
⏰ Context from checks skipped due to timeout of 90000ms (18)
- GitHub Check: testing (10, 3.12, dash)
- GitHub Check: testing (10, 3.12, bash)
- GitHub Check: testing (10, 3.11, bash)
- GitHub Check: testing-windows (10)
- GitHub Check: testing-windows (9)
- GitHub Check: testing-windows (8)
- GitHub Check: testing (2, 3.12, bash)
- GitHub Check: testing-windows (7)
- GitHub Check: testing (2, 3.11, bash)
- GitHub Check: testing-windows (6)
- GitHub Check: testing-windows (5)
- GitHub Check: testing (1, 3.12, dash)
- GitHub Check: testing-windows (4)
- GitHub Check: testing (1, 3.12, bash)
- GitHub Check: testing-windows (3)
- GitHub Check: testing (1, 3.11, bash)
- GitHub Check: testing-windows (2)
- GitHub Check: testing-windows (1)
🔇 Additional comments (4)
snakemake/workflow.py (4)
1779-1787: New validation function looks great!
This helper method centralizes the validation logic and makes the code more maintainable. No issues noted.
1794-1794: Straightforward check for envmodules usage
Calling the helper function here correctly enforces consistency with template rules.
1800-1800: Thorough conda environment check
Leveraging the new helper function here efficiently prevents incompatible directives.
1808-1808: Consistent container usage check
This call similarly helps maintain the deployment constraints.
🤖 I have created a release *beep* *boop* --- ## [8.30.0](v8.29.3...v8.30.0) (2025-03-11) ### Features * Add extra input size properties ([#2424](#2424)) ([359ae2e](359ae2e)) * shell function calls inside of the 'run' directive now use conda, container, or envmodules specifications ([#2289](#2289)) ([0193e34](0193e34)) * xonsh support for script directive ([#3310](#3310)) ([d1c369b](d1c369b)) ### Bug Fixes * include unit test templates in `setup.py` ([#3362](#3362)) ([b47252c](b47252c)) ### Documentation * clearly explain report rendering to ZIP archive ([#3357](#3357)) ([948e8fb](948e8fb)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…a, container, or envmodules specifications (snakemake#2289) ### Description Rules that combine python and shell code can be quite useful, e.g.: ``` run: x = parse_input_file(input) shell('command --param {x}') ``` Conda integration for the shell commands in such a rule is currently only supported like this: ``` shell('command --param {x}', conda_env='named_preexisting_environment') ``` Named pre-existing conda environments make a workflow however more difficult to distribute and could lead to reproducibility issues. Better would be to use conda specification files (e.g. 'envs/my_env.yaml'). This is however not supported by 'shell' function calls. Implementing it would not be straightforward, because these shell-calls are only executed during the processing of a rule, and snakemake therefore would not know at startup which conda environments would be needed. Creating the environments as-needed during rule execution would lead to issues with rules that are executed in parallel. This issue has bee noted before snakemake#163, snakemake#1619. One possible solution is to put all code within e.g. a separate python script which in turn executes 'command'. Then everything can run within the conda environment. This however requires quite a bit of boilerplate code to substitute what snakemake normally does (e.g. argument parsing, formatting of the shell command based on params/resources/input/output files, subprocess machinery). Another alternative is to implement this with a bash script and use the 'shell' directive, but this is usually not as readable as a python solution, particularly if one needs to do more complex operations, e.g. parse an input file. This pull request therefore enables the following rule structure: ``` conda: 'envs/my_env.yaml' run: x = parse_input_file(input) shell('command --param {x}') ``` This will perform 'command --param {x}' within the specified conda environment. Note that it will not perform the whole run block within the conda environment, but only shell function calls that are executed from the run block. A note specifying this is added to the documentation. For symmetry, it makes this change also for the 'container' directive. No large code changes are required, as the functionality is already present, only a check has to be removed. One limitation of this solution is that each shell call executed from the run script will be performed in the same conda environment (it is though possible to execute without the conda environment using `shell('command', conda_env=None)`). A future implementation could make this more flexible by enabling something like this: ``` conda: env1='envs/env1.yaml', env2='envs/env2.yaml' run: if <condition>: shell('command1 {input}', conda_env=conda_env.env1) else: shell('command2 {input}', conda_env=conda_env.env2) ``` ### QC <!-- Make sure that you can tick the boxes below. --> * [x] The PR contains a test case for the changes or the changes are already covered by an existing test case. * [x] The documentation (`docs/`) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Documentation** - Revised guidance now allows using conda environments and container setups with an additional workflow directive, enhancing clarity on environment execution. - **New Features** - Expanded support for environment and container configurations increases workflow flexibility. - Introduced a new rule for executing a Python script within a conda environment. - **Tests** - Added new test cases to validate the execution of conda environments and updated existing tests to reflect changes in script paths and outputs. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Johannes Köster <johannes.koester@uni-due.de>
🤖 I have created a release *beep* *boop* --- ## [8.30.0](snakemake/snakemake@v8.29.3...v8.30.0) (2025-03-11) ### Features * Add extra input size properties ([snakemake#2424](snakemake#2424)) ([359ae2e](snakemake@359ae2e)) * shell function calls inside of the 'run' directive now use conda, container, or envmodules specifications ([snakemake#2289](snakemake#2289)) ([0193e34](snakemake@0193e34)) * xonsh support for script directive ([snakemake#3310](snakemake#3310)) ([d1c369b](snakemake@d1c369b)) ### Bug Fixes * include unit test templates in `setup.py` ([snakemake#3362](snakemake#3362)) ([b47252c](snakemake@b47252c)) ### Documentation * clearly explain report rendering to ZIP archive ([snakemake#3357](snakemake#3357)) ([948e8fb](snakemake@948e8fb)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>












Description
Rules that combine python and shell code can be quite useful, e.g.:
Conda integration for the shell commands in such a rule is currently only supported like this:
Named pre-existing conda environments make a workflow however more difficult to distribute and could lead to reproducibility issues.
Better would be to use conda specification files (e.g. 'envs/my_env.yaml'). This is however not supported by 'shell' function calls. Implementing it would not be straightforward, because these shell-calls are only executed during the processing of a rule, and snakemake therefore would not know at startup which conda environments would be needed. Creating the environments as-needed during rule execution would lead to issues with rules that are executed in parallel.
This issue has bee noted before #163, #1619. One possible solution is to put all code within e.g. a separate python script which in turn executes 'command'. Then everything can run within the conda environment. This however requires quite a bit of boilerplate code to substitute what snakemake normally does (e.g. argument parsing, formatting of the shell command based on params/resources/input/output files, subprocess machinery). Another alternative is to implement this with a bash script and use the 'shell' directive, but this is usually not as readable as a python solution, particularly if one needs to do more complex operations, e.g. parse an input file.
This pull request therefore enables the following rule structure:
This will perform 'command --param {x}' within the specified conda environment. Note that it will not perform the whole run block within the conda environment, but only shell function calls that are executed from the run block. A note specifying this is added to the documentation. For symmetry, it makes this change also for the 'container' directive.
No large code changes are required, as the functionality is already present, only a check has to be removed.
One limitation of this solution is that each shell call executed from the run script will be performed in the same conda environment (it is though possible to execute without the conda environment using
shell('command', conda_env=None)).A future implementation could make this more flexible by enabling something like this:
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
Documentation
New Features
Tests