fix: checkpoints inside modules are overwritten#2843
fix: checkpoints inside modules are overwritten#2843fgvieira wants to merge 3 commits intosnakemake:mainfrom
Conversation
|
📝 WalkthroughWalkthroughThis pull request adds assertions to a test Snakefile and introduces a new function in a module-specific Snakefile. The assertions check that the function, defined in the module file and returning a constant value of 15, behaves as expected when invoked from two test contexts. In addition, a new expected results file is added, providing a reference output for validation. Changes
Sequence Diagram(s)sequenceDiagram
participant Runner as Test Runner
participant Test as Test Snakefile
participant Module as Module-Test Snakefile
Runner->>Test: Execute test for test_a
Test->>Module: Invoke some_func()
Module-->>Test: Return 15
Test->>Runner: Assert value equals 15
Runner->>Test: Execute test for test_b
Test->>Module: Invoke some_func()
Module-->>Test: Return 15
Test->>Runner: Assert value equals 15
Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
⏰ Context from checks skipped due to timeout of 90000ms (31)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
tests/test_module_checkpoint/Snakefile (1)
1-36: Well-structured test design for module checkpoint verificationThe test setup effectively demonstrates the checkpoint bug by:
- Using identical modules with different prefixes and configs
- Ensuring independent execution paths through unique prefixes
- Verifying consistent behavior through assertions
This design should successfully validate that checkpoints maintain independence across module instances.
tests/test_module_checkpoint/module-test/Snakefile (2)
3-4: Add docstring to clarify function purposeThe function's purpose in testing module functionality should be documented for better maintainability.
def some_func(): + """Helper function that returns 15, used to verify module loading and execution.""" return 15
7-11: Consider adding output verificationWhile the workflow structure is good for testing module checkpoint behavior, consider adding output verification to ensure the correct paths are being used.
Add a verification step to the aggregate rule:
rule aggregate: input: aggregate_input output: "aggregated/{sample}.txt" + run: + # Verify input path matches expected module prefix + expected_prefix = "post" if wildcards.sample == "a" else "alt" + input_path = str(input[0]) + assert input_path.startswith(expected_prefix), f"Unexpected input path: {input_path}" + shell("touch {output}") - shell: - "touch {output}"Also applies to: 42-48
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
tests/test_module_checkpoint/Snakefile(1 hunks)tests/test_module_checkpoint/expected-results/a/alt/a.txt(1 hunks)tests/test_module_checkpoint/expected-results/a/somestep/a.txt(1 hunks)tests/test_module_checkpoint/expected-results/b/alt/b.txt(1 hunks)tests/test_module_checkpoint/expected-results/b/somestep/b.txt(1 hunks)tests/test_module_checkpoint/module-test/Snakefile(1 hunks)
✅ Files skipped from review due to trivial changes (4)
- tests/test_module_checkpoint/expected-results/a/alt/a.txt
- tests/test_module_checkpoint/expected-results/a/somestep/a.txt
- tests/test_module_checkpoint/expected-results/b/alt/b.txt
- tests/test_module_checkpoint/expected-results/b/somestep/b.txt
🔇 Additional comments (5)
tests/test_module_checkpoint/Snakefile (4)
1-1: LGTM: Shell configuration is appropriate
Setting bash as the shell executable ensures consistent behavior across different environments.
35-36: Verify the function definition and its expected behavior
The assertions check for consistent behavior of some_func() across modules. Let's verify its implementation.
#!/bin/bash
# Description: Verify some_func() definition and implementation
# Expected: Find function definition that returns 15
echo "Searching for some_func definition:"
rg "def\s+some_func" -A 5 "tests/test_module_checkpoint/module-test/Snakefile"28-32: Verify the existence of referenced aggregate rules
The all rule references test_a_aggregate and test_b_aggregate. Let's verify these rules exist in the module.
#!/bin/bash
# Description: Verify referenced rules exist in the module
# Expected: Find 'aggregate' rule definition in the module Snakefile
echo "Searching for aggregate rule definition:"
rg "rule\s+aggregate:" "tests/test_module_checkpoint/module-test/Snakefile"4-25: Verify the module Snakefile and its checkpoint implementation
The module configuration looks correct, but we should verify the referenced Snakefile and its checkpoint implementation.
tests/test_module_checkpoint/module-test/Snakefile (1)
12-18: Validate config value before writing to file
The checkpoint writes the raw config value to a file without validation. This could cause issues if the config contains special characters or is not a string.
Consider adding config validation:
checkpoint somestep:
output:
"somestep/{sample}.txt"
+ run:
+ # Ensure config is string-safe
+ config_str = str(config).replace('\n', ' ').strip()
+ with open(output[0], 'w') as f:
+ f.write(config_str)
- shell:
- # simulate some output vale
- "echo {config} > {output}"|
@fgvieira shouldn't the testcase fail? |
|
ping @fgvieira |
|
Sorry for the late reply. The test does not fail because both modules have the same rules (since it is the same |
|
@fgvieira Hello, may I push some commits based on your code? |
|
Sure! 😄 |
|
4fd665d to
7b62f8c
Compare
|
@fgvieira @matthewfeickert Hello, do you think this solution #3359 accepable? I just don't know how to merge that to this PR |
update of #2843, may merge to there <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a modular workflow with distinct processing stages and clear output checkpoints. - Added new modules and rules for enhanced workflow capabilities. - **Tests** - Expanded the test suite with additional module checkpoint coverage for enhanced workflow integrity. - **Refactor** - Optimized checkpoint management to improve modularity and stability in workflow execution. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: fgvieira <1151762+fgvieira@users.noreply.github.com> Co-authored-by: Johannes Koester <johannes.koester@uni-due.de>
|
@Hocnonsense seems to be working. Thanks! |
|
Cheers! Happy that I can help! |
update of snakemake#2843, may merge to there <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced a modular workflow with distinct processing stages and clear output checkpoints. - Added new modules and rules for enhanced workflow capabilities. - **Tests** - Expanded the test suite with additional module checkpoint coverage for enhanced workflow integrity. - **Refactor** - Optimized checkpoint management to improve modularity and stability in workflow execution. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: fgvieira <1151762+fgvieira@users.noreply.github.com> Co-authored-by: Johannes Koester <johannes.koester@uni-due.de>




Description
When including two modules with checkpoints (or the same module twice with different prefixes), only the checkpoint of the last loaded module is used.
In the example here, the checkpoint
somestepis run twice the moduletest_b(under prefixb) while it should be run one for each prefix (as exemplified underexpected-results).QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit