checkpoint: pipeline finishing early #823

@nick-youngblut

I'm running snakemake 5.31.1 (installed from bioconda).

When a checkpoint is used, the pipeline often reports a successful finish without completing all steps, and without generating all of the output files requested by the all rule.

For instance, here's the end of a recent run:

[Sun Jan  3 12:52:42 2021]
Finished job 4215.
2877 of 6702 steps (43%) done
Removing temporary output file /ebio/abt3_scratch/nyoungblut/LLPRIMER_101310211274/primers_raw/28/primers_126.txt.
[Sun Jan  3 12:52:43 2021]
Finished job 912.
2878 of 6702 steps (43%) done
Removing temporary output file /ebio/abt3_scratch/nyoungblut/LLPRIMER_101310211274/primers_raw/34/primers_191.txt.
[Sun Jan  3 12:52:43 2021]
Finished job 4515.
2879 of 6702 steps (43%) done
Removing temporary output file /ebio/abt3_scratch/nyoungblut/LLPRIMER_101310211274/primers_raw/28/primers_201.txt.
[Sun Jan  3 12:52:44 2021]
Finished job 1437.
2880 of 6702 steps (43%) done
Complete log: /ebio/abt3_projects/software/dev/ll_pipelines/llprimer/.snakemake/log/2021-01-03T115355.280837.snakemake.log

Pipeline complete! Creating report...

If I try to "finish" the run by re-running snakemake, it starts from the beginning of the pipeline instead of picking up where it left off.

I don't have a minimal example yet. Maybe others have run into this problem (or soon will) and can post a minimal example from a simpler pipeline than mine.

The only checkpoint in my pipeline generates a large number of tsv files, and the output for that checkpoint rule is a directory. My aggregation function is:

def aggregate_primer_info(wildcards):
    # Output directory of the checkpoint (the DAG is re-evaluated once it has run)
    chk_out = checkpoints.clusters_core_genes.get(**wildcards).output[0]
    # One final primer file per (primer length, cluster ID) combination
    F = expand(cgp_dir + 'primers_final/{plen}/primers_{x}.tsv',
               plen = config['params']['cgp']['primerprospector']['primer_lengths'],
               x = glob_wildcards(os.path.join(chk_out, 'clusters_{i}.fasta')).i)
    return F
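
To make it easier to see which files this function asks for, here is a rough standalone sketch of the same logic using only the Python standard library. It is not Snakemake's implementation: `expected_primer_files` is a hypothetical helper, the regex only approximates what `glob_wildcards` does, and the primer lengths and output prefix in the usage part are made-up values for illustration.

```python
import re
import tempfile
from itertools import product
from pathlib import Path


def expected_primer_files(chk_out, primer_lengths, out_dir):
    """Approximate the aggregation function without Snakemake (hypothetical helper)."""
    pattern = re.compile(r"clusters_(.+)\.fasta")
    # Roughly what glob_wildcards(...).i yields: the {i} part of each matching file.
    ids = []
    for path in Path(chk_out).iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            ids.append(match.group(1))
    ids.sort()
    # Roughly what expand(...) yields: every (plen, x) combination.
    return [
        str(Path(out_dir) / "primers_final" / str(plen) / f"primers_{x}.tsv")
        for plen, x in product(primer_lengths, ids)
    ]


if __name__ == "__main__":
    # Illustrative values only; the real lengths come from the pipeline config.
    with tempfile.TemporaryDirectory() as tmp:
        for i in (1, 2, 3):
            (Path(tmp) / f"clusters_{i}.fasta").touch()
        targets = expected_primer_files(tmp, [28, 34], "cgp")
        print(len(targets), "expected final files")
```

Comparing a list like this against the files actually present after an early finish could help show which aggregated targets were never scheduled.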

I don't see why snakemake is ending the run early, especially since the final, aggregated output files requested by the all rule had not been created when the run finished.

Labels: bug (Something isn't working)