Update default masking from end to 200 bases #939

Merged
trvrb merged 2 commits into master from mask-from-end on Apr 29, 2022
@trvrb (Member) commented Apr 29, 2022

Description of proposed changes

Update the default mask parameters to mask 200 bases from the end of the genome instead of the current 50. This is necessary because circulating 21L viruses carry a large deletion in this region. The deletion causes alignment problems, and the resulting misalignment appears as excess mutations in the tree.

The issue can currently be seen at https://nextstrain.org/ncov/gisaid/global?c=gt-nuc_29765,29766,29767,29768,29760&m=div.

[Screenshot: mutations]
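To illustrate why widening the mask covers the affected sites, here is a minimal Python sketch of end-masking. It is not the workflow's actual masking code: the function names `mask_from_end` and `is_masked_from_end` are hypothetical, and the genome length of 29,903 is an assumption based on the Wuhan-Hu-1 reference rather than something stated in this pull request.

```python
# Hypothetical helpers for illustration only; not taken from the ncov workflow.

# Assumed length of the SARS-CoV-2 reference genome (Wuhan-Hu-1).
GENOME_LENGTH = 29903


def mask_from_end(sequence: str, n_bases: int = 200, mask_char: str = "N") -> str:
    """Replace the last n_bases characters of an aligned sequence with mask_char."""
    if n_bases <= 0:
        return sequence
    n = min(n_bases, len(sequence))
    return sequence[: len(sequence) - n] + mask_char * n


def is_masked_from_end(position: int, n_bases: int, genome_length: int = GENOME_LENGTH) -> bool:
    """Return True if a 1-based position falls within the last n_bases of the genome."""
    return position > genome_length - n_bases


# Sites highlighted in the linked view of the tree.
problem_sites = [29760, 29765, 29766, 29767, 29768]

covered_by_50 = [is_masked_from_end(p, 50) for p in problem_sites]
covered_by_200 = [is_masked_from_end(p, 200) for p in problem_sites]
```

Under these assumptions, a 50-base mask starts at position 29,854 and leaves all five sites unmasked, while a 200-base mask starts at position 29,704 and covers them.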

Testing

Testing has been fairly extensive with @corneliusroemer's https://nextstrain.org/groups/neherlab builds using https://github.com/neherlab/ncov-simple.

Release checklist

  • Update docs/src/reference/change_log.md in this pull request to document these changes by the date they were added.

trvrb added 2 commits April 29, 2022 13:40
@trvrb merged commit 9164487 into master Apr 29, 2022
@trvrb deleted the mask-from-end branch April 29, 2022 20:51
@emmahodcroft (Member) commented

Worth noting that if/when Nextalign can handle this deletion better, this masking could be reduced.
