export v2: Add --metadata-columns option#1384
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1384 +/- ##
==========================================
+ Coverage 66.71% 66.96% +0.24%
==========================================
Files 69 69
Lines 7322 7341 +19
Branches 1799 1804 +5
==========================================
+ Hits 4885 4916 +31
+ Misses 2169 2154 -15
- Partials 268 271 +3 ☔ View full report in Codecov by Sentry. |
|
Nice one Jover - the overall aim of this work is something I've wanted for a long time. I haven't tested the code, just skimmed through.
Here do you mean visible in the modal when (shift-)clicking on a branch/tip? It'd be good to surface them as sidebar filtering options too, but that'll take an Auspice change I think. My understanding of the linked issue (and from discussions with others) is that datasets are adding unnecessary colorings in order to expose metadata as filtering options. |
Yes, I mean they will only be visible on the modal on click.
Yup, I think this will require changes in Auspice. My understanding is that currently filters must be defined in the Auspice JSON under |
victorlin
left a comment
There was a problem hiding this comment.
Functionality is great! Suggestions below are non-blocking (except changelog request).
6b1dbf6 to
4bf28ec
Compare
jameshadfield
left a comment
There was a problem hiding this comment.
Allows users to specify additional metadata columns to export as node attributes that are not specified as coloring options via the existing --color-by-metadata option or the Auspice config JSON. This allows the metadata columns to be visible in the tree in Auspice without polluting the color-by options.
This behaviour is already possible by specifying the (TSV metadata column names) in the auspice config's filters array. There are side effects beyond those implemented in this PR, as each key in filters will be shown as a (collapsed) section in the footer, but the (big) upside is that the metadata is searchable.
I want to change the general behaviour here a bit, but I think we are better to sketch the big direction out first before merging any PRs. I'm envisioning something like the following, but it's not been completely thought through:
- Auspice uses all
node_attrsas sidebar filtering options (and maybe make branch labels searchable too, but that's complicated). - The
filtersarray (in the dataset JSON) is reserved for the items listed in the footer. I see this functionality as less important now that we have sidebar filtering. - We have some way of exporting metadata (TSV) fields without them being colourings. Maybe that's this PR?
But maybe just doing (1) is enough? Thoughts?
| help="delimiters to accept when reading a metadata file. Only one delimiter will be inferred.") | ||
| optional_inputs.add_argument('--metadata-id-columns', default=DEFAULT_ID_COLUMNS, nargs="+", | ||
| help="names of possible metadata columns containing identifier information, ordered by priority. Only one ID column will be inferred.") | ||
| optional_inputs.add_argument('--metadata-columns', nargs="+", |
There was a problem hiding this comment.
Any functionality we support via command line arguments should be available via the auspice-config JSON as well, as the former is a subset of the latter's capabilities.
There was a problem hiding this comment.
Hmm, this flag relates to the metadata input/export but not really Auspice control settings. Not sure this fits in the auspice-config JSON?
There was a problem hiding this comment.
I see it as controlling what data is exported, similar to filters & colours. The (original) intention when we started adding command line arguments for options in the config JSON was for auspice-config to be all you need to control what data the JSON contains.
There was a problem hiding this comment.
Okay, that makes sense. I'll move up the flag to be part of the config args group too then.
There was a problem hiding this comment.
Auspice config schema update and changes to support the new property added in bdb2b2c
I think (1) is definitely needed. I think this will require some design on filtering for continuous value fields? On (2), I think we can even deprecate the footer filters. I definitely only use the sidebar filtering and have not looked at footer filters in a while. I think even if we implement (1), we still need some way to support (3). I think it would be nice to fully detangle exported metadata from coloring options. I would want to be more explicit about what metadata fields are exported instead of letting it stay a side effect of other Auspice configurations. |
Allows users to specify additional metadata columns to export as node attributes that are not specified as coloring options via the existing `--color-by-metadata` option or the Auspice config JSON. This allows the metadata columns to be visible in the tree in Auspice without polluting the color-by options.
Suggested by @victorlin in review.
4bf28ec to
1b83e78
Compare
|
Rebased to fix merge conflicts in changelog. |
|
@jameshadfield since (1) was implemented in nextstrain/auspice#1743, do you still have feedback here? |
My interpretation of the deprecation here was to deprecate the auspice UI, but keep the augur export filters setting and use this as the way to add metadata that won't be a coloring. As in "what data would you like to include for filtering in auspice". But would you prefer to add (If we do add |
I'd prefer to add For me, |
Since the new `--metadata-columns` flag controls what data is available in the final Auspice JSON, it should have a matching option in the Auspice config JSON setting. This also moves the flag to the "DISPLAY CONFIGURATION" group to match the other Auspice config controlled settings. As per the existing pattern, the command line flag overrules what is set in the config file. Based on feedback from @jameshadfield #1384 (comment)
1b83e78 to
32c9740
Compare
I'd vote to please keep the footer filters, if possible. I still use them regularly and find them helpful for quickly picking from a longer list of options (e.g., year-month values for flu, clades, regions, etc.). It's generally been helpful to have a way to see in a single view what the filter fields are and the specific values within each field. |
jameshadfield
left a comment
There was a problem hiding this comment.
I haven't re-reviewed everything but the changes in bdb2b2c look good by inspection
@huddlej No worries, this PR doesn't touch the footer filters! Although this reminded me that the default footer filters will likely only be a subset of available filter fields since it currently does not include the additional metadata columns! Lines 539 to 542 in 4bf72f8 |
|
Merging this PR as-is, will update the footer filters separately (if desired). |
Fixing changelog update made in #1384. The PR was actually released as part of v24.2.0, not v 24.1.0.
Description of proposed changes
Allows users to specify additional metadata columns to export as
node attributes that are not specified as coloring options via
the existing
--color-by-metadataoption or the Auspice config JSON.This allows the metadata columns to be visible in the tree in Auspice
without polluting the color-by options.
Related issue(s)
Resolves #1383
Checklist