Conversation
| printshellcmds: True | ||
| show-failed-logs: True | ||
| restart-times: 2 | ||
| stats: stats.json |
There was a problem hiding this comment.
Re: 4b2685c
(non-blocking)
No more stats files? That's a bummer 😞 I've been using Tom's vis tool for these files to inspect performance of augur filter calls. We can still get the same numbers from per-rule benchmark files, but it's been nice to have a summary across all rules.
There was a problem hiding this comment.
I didn't really look into it (hence the "WIP" commit), I was just seeing the minimal amount of work to get the workflow running. The migration guide, somewhat comically, says "Unsupported?" so... maybe it's supported in some capacity? Please feel free to dive in here.
There was a problem hiding this comment.
Thanks, nothing wrong with your changes. Just wanted to rant 😄
Here's some discussion on its removal: snakemake/snakemake#3523
The reports feature is a potential replacement, but I haven't used it.
There was a problem hiding this comment.
Since we are moving to v9 we should look into the reports feature, but from ~1 minute of scanning the docs it doesn't seem anywhere as nice. All in all we're not going to block the v9 upgrade on this.
There was a problem hiding this comment.
When I tried out the reports feature last year, it wasn't that great...
There was a problem hiding this comment.
For norovirus/dengue work, I roughly generated a python script (prompting chatgpt o4) to convert the "benchmarks" directory of timings to a "stats.json" file. Seems to work so far, although probably room for improvement.
./summarize-stats.py --benchmark-dir phylogenetic/benchmarks --stats stats.json
|
Triggered a GISAID 21L test run using, |
|
Added e115838 to check the Snakemake version based on today's dev chat. This is what the built-in error looks like |
The previous version of `path_or_url` was not compatible with Snakemake v8, which implemented a storage plugin architecture [1]. This is a complete rewrite of the approach and was originally written for zika [2]. See [2] for more commentary as well as the docs added to `remote_files.smk` Google Storage URIs are not supported due to authn complexity, as discussed in [3] and HTTP support is dropped as requested in [4]. The default of `keep_local` is now True. Since all (n=3) existing usages of `path_or_url` specify `keep_local=True` this is consistent with previous behaviour. [1] <https://snakemake.readthedocs.io/en/stable/snakefiles/storage.html> [2] <nextstrain/zika#89> [3] <nextstrain/zika#89 (comment)> [4] <nextstrain/zika#89 (comment)>
Using 8.0.0 as the minimum version because that is when the storage plugins were added. Even though we are bumping runtimes to Snakemake v9, we are not using any specific v9 features.
This printed the reason for each executed rule but was already deprecated and always true in v7 [1] and has been removed entirely in later snakemake versions. The upgrade path is clear [2]: "Deprecated: Drop it and don't worry about anything" Note: The same changes were applied to zika [3] [1] <https://snakemake.readthedocs.io/en/v7.22.0/executing/cli.html#output> [2] <https://snakemake.readthedocs.io/en/stable/getting_started/migration.html> [3] <nextstrain/zika#88>
The `--stats` CLI argument is no longer supported and according to <https://snakemake.readthedocs.io/en/stable/getting_started/migration.html> there is no suitable replacement. The reports feature looks along the same lines as stats, but doesn't seem as well suited for our purposes. See <#1180 (comment)> for more discussion around this.
The keyword storage now has a special meaning in Snakemake <https://snakemake.readthedocs.io/en/stable/snakefiles/storage.html>. Rename our existing usage of this variable to avoid conflicts.
As we now require snakemake v8 we can leverage the more convenient
`input.size_mb` functionality, which is functionally equivalent to
`input.size / 1024 / 1024` [1]
The `rule colors` wants the size of an individual input (not the sum of
all inputs) and the `_IOFile.size` function is now a coroutine. The more
widely used (by us) `InputFiles.size` function uses `async_run` to allow
coroutine calls so I think it's fine for us to do so explicitly to get
the size of an individual file. Other code paths end up at the same
point, such as `input.size_files_mb[list(input.keys()).index('metadata')]`
Closes #1114
[1] <https://github.com/snakemake/snakemake/blob/7f2606e453ba6bd83222ec3679178ae96bd142d2/src/snakemake/io/__init__.py#L1894-L1923>
[2] <https://github.com/snakemake/snakemake/blob/7f2606e453ba6bd83222ec3679178ae96bd142d2/src/snakemake/io/__init__.py#L622-L633>
With Snakemake v9 we observed the following error:
No validator found for JSON Schema version identifier 'http://json-schema.org/draft-06/schema'
Defaulting to validator for JSON Schema version 'https://json-schema.org/draft/2020-12/schema'
Note that schema file may not be validated correctly.
There's nothing salient in
<https://snakemake.readthedocs.io/en/stable/project_info/history.html>
indicating why this happened but it wasn't observed in v7. I read
through the migration guides to get us up to 2020-12
<https://json-schema.org/specification/migration> and didn't find any
changed needed, which matches with the validation of our schema using
the new schema defs.
e115838 to
38861d3
Compare
The code here is lifted directly from <nextstrain/ncov#1180> which itself was lifted from <#89>. See those PRs for in-depth discussions. The `remote_files.smk` file added here will be moved to the github.com/nextstrain/shared/ repo and vendored across pathogen repos, including this one. Assuming this doesn't change cf. the ncov implementation I think it's fine to have the file copy-pasted around a few repos before it arrives in the shared repo.
No longer needed since we are removing Google Cloud support in ncov/remove_files <nextstrain/ncov#1180>
No longer needed since we are removing Google Cloud support in ncov/remove_files <nextstrain/ncov#1180>
The code here is lifted directly from <nextstrain/ncov#1180> which itself was lifted from <#89>. See those PRs for in-depth discussions. The `remote_files.smk` file added here will be moved to the github.com/nextstrain/shared/ repo and vendored across pathogen repos, including this one. Assuming this doesn't change cf. the ncov implementation I think it's fine to have the file copy-pasted around a few repos before it arrives in the shared repo.
We are no longer using these internally [1] however we can support them again (using different code) in the future as needed [1] <https://bedfordlab.slack.com/archives/CSKMU6YUC/p1752630023070009>
If we wish to support this again in the future we can re-implement it in a snakemake v9 compatible way. Given the hardcoded paths in the config I'm confident this was only ever used by the neher lab and not external users.
38861d3 to
2665fb0
Compare
2665fb0 to
d5a0bcb
Compare
Started failing with 403 in CI https://github.com/nextstrain/ncov/actions/runs/16352499665/job/46202617554
I originally implemented this in <nextstrain/zika#89> and that's where the bulk of the discussion and review happened. That PR wasn't merged and instead it was added to ncov via <nextstrain/ncov#1180>. The code in this commit is copied directly from <https://github.com/nextstrain/ncov/blob/137ffc479a1bfe4e2c6ae1bbcd1b22e3afc5f8dd/workflow/snakemake_rules/remote_files.smk> Closes #56
I originally implemented this in <nextstrain/zika#89> and that's where the bulk of the discussion and review happened. That PR wasn't merged and instead it was added to ncov via <nextstrain/ncov#1180>. The code in this commit is copied directly from <https://github.com/nextstrain/ncov/blob/137ffc479a1bfe4e2c6ae1bbcd1b22e3afc5f8dd/workflow/snakemake_rules/remote_files.smk> Closes #56
I originally implemented this in <nextstrain/zika#89> and that's where the bulk of the discussion and review happened. That PR wasn't merged and instead it was added to ncov via <nextstrain/ncov#1180>. The code in this commit is copied directly from <https://github.com/nextstrain/ncov/blob/137ffc479a1bfe4e2c6ae1bbcd1b22e3afc5f8dd/workflow/snakemake_rules/remote_files.smk> with some changes to comments in the code as applicable. Closes #56
The code here is lifted directly from <nextstrain/ncov#1180> which itself was lifted from <#89>. See those PRs for in-depth discussions. The `remote_files.smk` file added here will be moved to the github.com/nextstrain/shared/ repo and vendored across pathogen repos, including this one. Assuming this doesn't change cf. the ncov implementation I think it's fine to have the file copy-pasted around a few repos before it arrives in the shared repo.
Needs a lot of testing and some more polish, but it doesn't seem like it'll take much to support Snakemake v9. I tested using Snakemake v9.6.2 via a single modified ncov-open run:
snakemake --cores 3 -p --profile nextstrain_profiles/nextstrain-open -f auspice/ncov_open_global_1m.jsonwhich now completes successfully.Please add comments / commits to this PR if you can!
Closes #1179
May close #1114, but I only tested on Snakemake v9