Improve variant calling from ampliconic PE data#33

Merged
mvdbeek merged 6 commits intogalaxyproject:mainfrom
wm75:artic-illumina-variation-update
Jun 5, 2021

@wm75 wm75 commented May 20, 2021

Fixes weaknesses of the previous workflow discovered from COG-UK
tracking effort results on usegalaxy.* instances and from comparison to
https://www.veo-europe.eu/ results for the same data.

wm75 and others added 4 commits May 20, 2021 11:34
Otherwise tabular files with variable number of columns are sniffed
as txt files.
---------
# Changelog

## [0.3] - 2021-05-19

The ---- is needed to delineate entries for the release text.

... which was broken in planemo, but we should keep the formatting.

fwiw, this was the idea:

In [11]: import os

In [12]: def changelog_in_repo(target_repository_path):
    ...:     changelog = []
    ...:     for path in os.listdir(target_repository_path):
    ...:         if 'changelog.md' in path.lower():
    ...:             header_seen = False
    ...:             # Underline styles that mark an entry title in the changelog.
    ...:             header_chars = ('---', '===', '~~~')
    ...:             with open(os.path.join(target_repository_path, path)) as changelog_fh:
    ...:                 for line in changelog_fh:
    ...:                     if line.startswith(header_chars):
    ...:                         if header_seen:
    ...:                             # Second underline: drop the next entry's title line and stop.
    ...:                             return "\n".join(changelog[:-1])
    ...:                         else:
    ...:                             header_seen = True
    ...:                     changelog.append(line.rstrip())
    ...:     return "\n".join(changelog)
    ...:

In [13]: print(changelog_in_repo('workflows/sars-cov-2-variant-calling/sars-cov-2-pe-illumina-artic-variant-calling/'))
0.3
---

This version brings a number of tweaks to the ivar-dependent steps of the
workflow. Together, these are expected to make variant allele frequency
calculations more precise, in general, and robust in the face of an increasing
number of variants at primer binding sites:

- Upgrade ivar from version 1.2.2 to 1.3.1
  This affects ivar trim and ivar removereads
- Use the newly introduced -f option of ivar trim to exclude read pairs from
  further analysis that extend beyond amplicon boundaries.
  This change should be beneficial for accurate AF calculations in general,
  but in particular for corrected AF values after removal of biased amplicons,
  where aberrant read pairs often represent a larger fraction of the remaining
  reads.
- Run ivar trim only after realignment and addition of indel qualities by
  lofreq. This should make sure that indels close to primer sequences are
  seen as read-internal events.
- Turn the lower and upper thresholds for variant AF that trigger read removal
  into workflow input parameters and adjust their defaults to trigger read
  removal only in more obvious cases of non-fixed variants.
- Require a minimum depth of coverage for recalled variants after read removal
  of 20 to ensure reliable AF values.
  This change also prevents situations where variants are recalled successfully
  after read removal, but are later excluded from variant reports generated by
  the reporting workflow due to that workflow's min_dp_alt >= 10 filter.
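
To illustrate the last two changelog items, here is a minimal Python sketch of the two checks they describe. It is not taken from the workflow itself: the function names, the inclusive comparisons, and the example threshold values (0.1 and 0.9) are assumptions made for illustration. Only the minimum depth of 20 comes from the changelog text.

```python
def needs_read_removal(af, lower, upper):
    """Return True if allele frequency `af` lies inside the [lower, upper]
    window that triggers read removal (inclusive bounds are an assumption)."""
    return lower <= af <= upper


def keep_recalled_variant(depth, min_depth=20):
    """Return True if a variant recalled after read removal has enough
    coverage for a reliable AF value (20 is the value named in the changelog)."""
    return depth >= min_depth


# Hypothetical threshold values; the real workflow defaults are not stated here.
LOWER_AF, UPPER_AF = 0.1, 0.9

print(needs_read_removal(0.5, LOWER_AF, UPPER_AF))   # clearly non-fixed variant
print(needs_read_removal(0.98, LOWER_AF, UPPER_AF))  # nearly fixed variant
print(keep_recalled_variant(12))                     # too shallow to trust the AF
```

Exposing the two bounds as arguments mirrors the idea of making them workflow input parameters rather than hard-coded values.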

Okay, let's follow https://keepachangelog.com/en/0.3.0/ ... I will update the function that didn't work in the first place.
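
For reference, a rough sketch of how the latest entry could be pulled out of a Keep a Changelog style file, where releases are introduced by headings such as `## [0.3] - 2021-05-19` instead of underlines. This is not the planemo implementation; the function name `latest_changelog_entry` and the sample input are made up for illustration.

```python
import re

# Matches release headings such as "## [0.3] - 2021-05-19".
RELEASE_HEADING = re.compile(r"^## \[(?P<version>[^\]]+)\]")


def latest_changelog_entry(text):
    """Return (version, body) for the first release section found in `text`."""
    version = None
    body = []
    for line in text.splitlines():
        match = RELEASE_HEADING.match(line)
        if match:
            if version is not None:
                break  # reached the next (older) release, stop collecting
            version = match.group("version")
            continue
        if version is not None:
            body.append(line.rstrip())
    return version, "\n".join(body).strip()


# Made-up sample input; only the 0.3 heading mirrors the one in this PR.
example = """# Changelog

## [0.3] - 2021-05-19

- Upgrade ivar from version 1.2.2 to 1.3.1

## [0.2] - 2021-01-01

- Older entry
"""

version, body = latest_changelog_entry(example)
print(version)
print(body)
```

Note that this sketch would also treat an `## [Unreleased]` section as the latest entry, so a real implementation would probably need to skip it.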

@mvdbeek mvdbeek merged commit 0bbc004 into galaxyproject:main Jun 5, 2021