Update SARS-CoV-2 PE variation workflow#94

Merged
mvdbeek merged 12 commits into galaxyproject:main from wm75:allele-frequency-fix-attempt2
Feb 11, 2022

wm75 (Member) commented Feb 8, 2022

No description provided.

wm75 added 12 commits February 5, 2022 00:01
Filtering relevant primer binding site variants now uses unbiased allele
frequencies recalculated from (DP4[2] + DP4[3]) / DP instead of relying
on the lofreq-provided value.
The lofreq-calculated AF is still what's reported in the final VCF output, but
the corresponding header INFO line gets rewritten to warn about this
fact.
This changeset also updates fastp to its latest (and likely faster) version.
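
As an illustration of the recalculation described above, here is a minimal Python sketch that recomputes the allele frequency from a VCF INFO string as (DP4[2] + DP4[3]) / DP. The function name `recalc_af` and the example INFO values are made up for illustration; the workflow itself performs this step with Galaxy tools, not with this code.

```python
def recalc_af(info: str) -> float:
    """Recompute the allele frequency as (DP4[2] + DP4[3]) / DP.

    `info` is the INFO column of one VCF record. DP4 holds the counts of
    ref-forward, ref-reverse, alt-forward and alt-reverse bases, so the
    last two entries together are the alt-supporting observations.
    """
    # Keep only key=value entries; bare flags (e.g. INDEL) are skipped.
    fields = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    dp = int(fields["DP"])
    dp4 = [int(x) for x in fields["DP4"].split(",")]
    if dp == 0:
        raise ValueError("DP is zero, allele frequency is undefined")
    return (dp4[2] + dp4[3]) / dp


# Hypothetical record: the reported AF (0.43) is left untouched, while the
# recalculated value is what a filter would use under this scheme.
example_info = "DP=200;AF=0.43;SB=0;DP4=50,50,45,45"
print(recalc_af(example_info))  # 0.45
```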
This changeset makes the amplicon bias correction more robust and more
interoperable with the reporting workflow.
The first and second (after amplicon removal) rounds of variant calling are now
carried out with identical lofreq parameter settings. bcftools annotate
is then used to carry over the bias-corrected call stats to the variant
calls obtained in the first round. At the same time both variant call
lists are filtered with (by default) identical DP and DP_ALT filters as
in the reporting workflow, and stats of filter-passing variants from the first
round that fail to pass after the second round of calling are
transferred back to the first bcftools annotate output.
Together this makes sure that no initially called variant gets lost as a
consequence of amplicon bias correction and that no initially
filter-passing variant gets filtered out after correction. The
AmpliconBias INFO flag is used to mark all such variants for which amplicon
bias correction was not performed.
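
The merging rules described above can be sketched in plain Python. This is one reading of the commit message, not the workflow's actual implementation (which uses bcftools annotate inside Galaxy). The `Call` class, the `merge_rounds` function, the threshold values, and the example variants are all hypothetical, and flagging calls that are absent from the second round is an assumption about what "correction was not performed" covers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Call:
    dp: int                      # total depth at the site
    dp_alt: int                  # depth supporting the alternate allele
    amplicon_bias: bool = False  # stands in for the AmpliconBias INFO flag


def passes(call: Call, min_dp: int, min_dp_alt: int) -> bool:
    """Apply the same DP / DP_ALT filter to a call from either round."""
    return call.dp >= min_dp and call.dp_alt >= min_dp_alt


def merge_rounds(first: dict, second: dict, min_dp: int, min_dp_alt: int) -> dict:
    """Carry second-round (bias-corrected) stats over to first-round calls.

    Every first-round variant is kept. Its stats are replaced by the
    corrected ones unless the variant is missing from the second round or
    would drop from filter-passing to filter-failing; in those cases the
    original stats are kept and the variant is flagged.
    """
    merged = {}
    for key, original in first.items():
        corrected = second.get(key)
        if corrected is None:
            merged[key] = replace(original, amplicon_bias=True)
        elif passes(original, min_dp, min_dp_alt) and not passes(
            corrected, min_dp, min_dp_alt
        ):
            merged[key] = replace(original, amplicon_bias=True)
        else:
            merged[key] = corrected
    return merged


# Hypothetical calls keyed by (position, ref, alt); thresholds are made up.
first_round = {
    (241, "C", "T"): Call(dp=100, dp_alt=90),
    (3037, "C", "T"): Call(dp=30, dp_alt=12),
    (23403, "A", "G"): Call(dp=50, dp_alt=40),
}
second_round = {
    (241, "C", "T"): Call(dp=80, dp_alt=75),  # corrected stats still pass
    (3037, "C", "T"): Call(dp=8, dp_alt=3),   # would fail after correction
}                                             # 23403 not called in round two

merged = merge_rounds(first_round, second_round, min_dp=10, min_dp_alt=5)
for key, call in merged.items():
    print(key, call)
```

In this toy example the first variant takes on its corrected stats, while the other two keep their first-round stats and carry the flag, so no first-round call disappears and no initially passing call starts failing.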
Updates bwa-mem, lofreq call and multiqc to their latest versions, fixes
the input connection to bcftools annotate that was broken in the previous
commit, and adds back a lost WF output label to multiqc.
bcftools annotate shuffles VCF INFO field elements and adds two more
timestamped header lines.

mvdbeek (Member) commented Feb 11, 2022

Ready to merge @wm75 ?

wm75 (Member, Author) commented Feb 11, 2022

Yes, ready @mvdbeek :)
At least one more PR (updating the Reporting WF) will be coming today, but this one's an improvement on its own.
