Feat/clade display names#1065

Merged
rneher merged 8 commits into master from feat/clade-display-names
May 12, 2023

@rneher
Member

@rneher rneher commented Apr 22, 2023

This PR:

  • adds a file defining the mapping between bare year-letter Nextstrain clades and display names
  • adds a script that rewrites the clades file using the new display names
  • removes the previously duplicated clades file with the alternative names
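
The renaming step described in the list above can be sketched roughly as follows. This is a minimal illustration, not the script added in this PR: the function names `parse_display_names` and `rename_clades` are hypothetical, the mapping file is assumed to be flat `key: value` lines (as in the `19B: 19B` entry reviewed further down), and the clades file is assumed to be a TSV with a `clade` column. The mutation rows and the `23B` mapping entry are placeholders.

```python
import csv
import io


def parse_display_names(text):
    """Parse flat `key: value` lines (a small YAML subset) into a dict."""
    names = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        names[key.strip()] = value.strip()
    return names


def rename_clades(tsv_text, display_names):
    """Return the clades TSV with the `clade` column replaced by display names.

    Clades without an entry in `display_names` are kept unchanged
    (an assumption made for this sketch).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row["clade"] = display_names.get(row["clade"], row["clade"])
        writer.writerow(row)
    return out.getvalue()


# Placeholder inputs, for illustration only.
mapping = parse_display_names("19B: 19B\n23B: 23B (XBB.1.16)\n")
clades_tsv = "clade\tgene\tsite\talt\n19B\tORF8\t84\tS\n23B\tS\t478\tR\n"
renamed = rename_clades(clades_tsv, mapping)
```

Only the `clade` column is rewritten here; if other columns in the real file refer to clade names, they would need the same treatment.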

@rneher
Member Author

rneher commented Apr 26, 2023

This worked as expected, except that the colors aren't generated properly because the display names are not in the metadata.

@rneher
Member Author

rneher commented Apr 26, 2023

image

rneher added 3 commits April 26, 2023 21:22
Previously, clade colors were restricted to terms in the metadata.
Color selection now uses the final clades (after modification through
renaming to display clades) instead. We previously relied on metadata
here, but display names are not present there.
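
The idea in this commit message can be illustrated with a small sketch, under stated assumptions: `assign_clade_colors` is a hypothetical helper, `node_clades` stands in for whatever per-node clade assignment the workflow produces after renaming, and the palette values are arbitrary. It is not the color-assignment code used in the repository.

```python
def assign_clade_colors(node_clades, palette):
    """Assign one color per clade name that actually appears on the tree.

    node_clades: dict mapping node name -> final (display) clade name.
    palette: list of color strings, at least as long as the number of clades.
    """
    clades = sorted(set(node_clades.values()))
    if len(clades) > len(palette):
        raise ValueError("palette has fewer colors than there are clades")
    return dict(zip(clades, palette))


# Illustrative values only.
node_clades = {
    "node_1": "23B (XBB.1.16)",
    "node_2": "19B",
    "node_3": "23B (XBB.1.16)",
}
colors = assign_clade_colors(node_clades, ["#4068CF", "#DC2F24"])
```

Because the color keys are taken from the renamed clade assignments rather than from a metadata column, a display name such as "23B (XBB.1.16)" receives a color even though it never appears in the metadata.
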
@rneher
Member Author

rneher commented Apr 27, 2023

This now works happily:
[image]

@rneher rneher requested a review from a team April 27, 2023 14:55
@@ -0,0 +1,32 @@
19B: 19B
Member

Is there a reason there's no 19A: 19A?

Member Author

none that I can think of ;)

@corneliusroemer
Member

Once this is merged I'll have to adjust the Nextclade data workflows, so I'll see what you come up with here. If we wanted to maintain WHO names for colouring, would it make sense to define a defaults/clade_who_mapping.yml? Or shall I maintain this separately in the workflow? That would make it a little less convenient for updates as currently all the changes for new clades are in this ncov repo.

I remember we talked about moving the ground truth to a separate "clade definition" repo - but we can also just have it in here, there's no real circularity.

Comment thread defaults/parameters.yaml Outdated
  lat_longs: "defaults/lat_longs.tsv"
  description: "defaults/description.md"
- clades: "defaults/clades.tsv"
+ clades: "defaults/clades_nextstrain.tsv"
Contributor

I wonder if we should keep the default file named as clades.tsv, since this is what we refer to in the docs (e.g., labeling clades, the workflow reference guide, etc.) and call the derived file for the workflow something different?

If we do rename the default, we just need to update references to it in the docs.

Member Author

I had named it in analogy to clades_who, but the latter isn't really part of this workflow. I am fine with renaming it.

 * add missing 19A to name translation table
 * rename clades file to match documentation
 * canonicalize clade name in emergence date table
@rneher rneher requested a review from huddlej May 7, 2023 11:00
@rneher
Member Author

rneher commented May 9, 2023

@huddlej, did this address your concern?

@huddlej
Copy link
Copy Markdown
Contributor

huddlej commented May 11, 2023

@rneher Yeah, this looks great. Thank you!

@rneher rneher merged commit 69f91c5 into master May 12, 2023
@rneher rneher deleted the feat/clade-display-names branch May 12, 2023 13:34
corneliusroemer added a commit to neherlab/nextclade_data_workflows that referenced this pull request May 22, 2023
See nextstrain/ncov#1065
Clade legacy is hence deprecated
We now have a new column "display clade name" which is a combination
of Nextstrain clade and Pango lineage, e.g. "23B (XBB.1.16)"
corneliusroemer added a commit to nextstrain/ncov-ingest that referenced this pull request Jun 20, 2023
Starting with dataset release `2023-06-16`, Nextclade no longer outputs a `clade_legacy` column into the TSV.
(This was implemented in neherlab/nextclade_data_workflows#42
and triggered by a refactor in ncov of how we annotate clades: nextstrain/ncov#1065.)
So as not to break downstream workflows that rely on ingest output `metadata.tsv` having `clade_legacy`, this PR adds a `clade_legacy` column to `metadata.tsv`
The values are defined as a simple mapping from `clade_nextstrain` (year-letter, e.g. 22F) to `clade_legacy` in `defaults/clade-legacy-mapping.yml`
This file lives in ingest for now to make this PR work without requiring changes to `ncov`.
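
A rough sketch of the mapping step described in this commit message, assuming metadata rows are handled as dictionaries. The helper name `add_clade_legacy`, the fallback to the bare year-letter name for unmapped clades, and the example legacy label are all assumptions made for illustration, not taken from the ingest code.

```python
def add_clade_legacy(rows, legacy_mapping):
    """Yield copies of metadata rows with a `clade_legacy` column added.

    rows: iterable of dicts that contain a `clade_nextstrain` key.
    legacy_mapping: dict from year-letter clade (e.g. "22F") to legacy name.
    Unmapped clades fall back to the bare year-letter name (an assumption).
    """
    for row in rows:
        out = dict(row)  # copy so the caller's row is left untouched
        clade = row.get("clade_nextstrain", "")
        out["clade_legacy"] = legacy_mapping.get(clade, clade)
        yield out


# Placeholder mapping and rows, for illustration only.
legacy_mapping = {"22F": "22F (Omicron)"}
metadata = [
    {"strain": "sample_a", "clade_nextstrain": "22F"},
    {"strain": "sample_b", "clade_nextstrain": "23B"},
]
with_legacy = list(add_clade_legacy(metadata, legacy_mapping))
```

Keeping the mapping as plain data means a new clade only needs one new entry in the mapping file, which matches the motivation given above for a simple lookup.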
joverlee521 added a commit that referenced this pull request Jan 29, 2024
The clade labels were updated in #1065.
Update the clade label so that the `assign_rbd_levels` script can find
the correct basal clade.

I had considered pulling this value out as a parameter in the config YAML,
but the original commit message¹ implies that this should not be
configurable.

¹ fb5f44e
joverlee521 added a commit that referenced this pull request Jan 30, 2024
The clade labels were updated in #1065.
Update the clade label so that the `assign_rbd_levels` script can find
the correct basal clade.

I had considered pulling this value out as a parameter in the config YAML,
but the original commit message¹ implies that this should _not_ be
configurable.

¹ fb5f44e