Description
Snakemake version
9.5.1
Describe the bug
The recently introduced fix in 9.5.1:
causes remote jobs to crash when the schema of the config file contains default values for properties. These default values are no longer loaded in the remote job, which causes a KeyError on the config. I do not get the error in 9.5.0.
Logs
For example, this is what I get:
KeyError in file "/data/cephfs-1/home/users/schubacm_c/work/projects/MPRAsnakeflow/workflow/rules/common.smk", line 70:
'skip_version_check'
File "/data/cephfs-1/home/users/schubacm_c/work/projects/MPRAsnakeflow/workflow/rules/common.smk", line 94, in <module>
Minimal example
My config files are versioned so that I can check whether the config file matches the version of the Snakemake pipeline. By default (set in the schema) the check is always on (skipping is set to false), so the user does not need to specify this option.
Start of my config schema:
type: object
properties:
  version:
    description: Version of MPRAsnakeflow
    type: string
    pattern: ^(\d+(\.\d+)?(\.\d+)?)|(0\.\d+(\.\d+)?)$
  skip_version_check:
    description: Skip version check
    type: boolean
    default: false

At the beginning of my general common.smk, I first validate the config and then check whether the version of the config matches the workflow version. However, the key skip_version_check is no longer present in remotely executed jobs, so they fail. My code in common.smk:
from snakemake.utils import validate

validate(config, schema="../schemas/config.schema.yaml")

import re

# Regular expression to match the leading major version number
pattern_major_version = r"^(\d+)"
# Regular expression to match a development version (leading 0 with an optional minor part)
pattern_development_version = r"^(0(\.\d+)?)"


def check_version(pattern, version, config_version):
    # Search for the pattern in both version strings
    match_version = re.search(pattern, version)
    match_config = re.search(pattern, config_version)
    # If both match, compare the captured parts and raise on a mismatch
    if match_version and match_config:
        if match_version.group(1) != match_config.group(1):
            raise ValueError(
                f"\033[38;2;255;165;0mVersion mismatch: MPRAsnakeflow version is {version}, but config version is {config_version}\033[0m"
            )


if not config["skip_version_check"]:
    check_version(pattern_development_version, version, config["version"])
    check_version(pattern_major_version, version, config["version"])

At first the workflow runs fine because the key skip_version_check is available. But as soon as the first rule is executed remotely (SLURM cluster), the job fails with a KeyError for skip_version_check.
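A possible stopgap, until the defaults are available in remote jobs again, is to read the key with a fallback that mirrors the schema default instead of indexing it directly. The following is only a sketch and has not been verified against 9.5.1: the plain dict stands in for Snakemake's `config`, the version string is made up, and `should_check_version` is a hypothetical helper name.

```python
# Stand-in for the config as seen inside a remote job, where the schema
# default for skip_version_check was not filled in (assumption for illustration).
remote_config = {"version": "0.5"}


def should_check_version(config):
    """Return True unless the config explicitly asks to skip the check.

    A missing key falls back to False, mirroring `default: false` in the schema.
    """
    return not config.get("skip_version_check", False)


# config["skip_version_check"] would raise KeyError here; .get() does not.
print(should_check_version(remote_config))
print(should_check_version({"version": "0.5", "skip_version_check": True}))
```

The drawback is that the default is then duplicated in the workflow code, which is exactly what defining it in the schema was meant to avoid.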
Additional context
I have always found it a smart approach to define default values of variables within a schema. This no longer seems to be possible.
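To make the expected behavior concrete, here is a simplified pure-Python sketch of filling top-level schema defaults into a config. It is not Snakemake's implementation, since `validate` handles this internally. `fill_defaults` is a hypothetical helper, the schema is a trimmed copy of the one above written as a Python dict, and the version string is made up.

```python
# Trimmed copy of the schema above, written as a Python dict for illustration.
schema = {
    "type": "object",
    "properties": {
        "version": {"type": "string"},
        "skip_version_check": {"type": "boolean", "default": False},
    },
}


def fill_defaults(config, schema):
    """Return a copy of config with missing top-level keys set to schema defaults."""
    filled = dict(config)
    for key, spec in schema.get("properties", {}).items():
        if key not in filled and "default" in spec:
            filled[key] = spec["default"]
    return filled


# The user only specifies the version, as in the report.
user_config = {"version": "0.5"}

# With defaults applied, the key can be accessed directly.
with_defaults = fill_defaults(user_config, schema)
print(with_defaults["skip_version_check"])

# Without that step, direct access fails, as in the remote job.
try:
    user_config["skip_version_check"]
except KeyError as err:
    print(f"KeyError: {err}")
```

The first access corresponds to what the main process sees after validation. The second corresponds to the remote job, where the default is missing.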