Skip to content

Update filtering and subsampling docs#222

Merged
victorlin merged 5 commits intomasterfrom
victorlin/update-subsampling-docs
Aug 20, 2024
Merged

Update filtering and subsampling docs#222
victorlin merged 5 commits intomasterfrom
victorlin/update-subsampling-docs

Conversation

@victorlin
Copy link
Copy Markdown
Member

@victorlin victorlin commented Aug 15, 2024

preview

Description of proposed changes

Prepping for upcoming weighted sampling docs. These changes are applicable to current augur filter features.

Related issue(s)

Checklist

  • Checks pass

@victorlin victorlin self-assigned this Aug 15, 2024
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 704f86d to b9ec0bb Compare August 15, 2024 22:18
@joverlee521 joverlee521 self-requested a review August 16, 2024 16:22
joverlee521
joverlee521 previously approved these changes Aug 16, 2024
Copy link
Copy Markdown
Contributor

@joverlee521 joverlee521 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement! I only left non-blocking suggestions, but I think all the additions are very helpful.

Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst Outdated
Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst Outdated
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 555a3e1 to c334b31 Compare August 16, 2024 21:34
@victorlin victorlin dismissed joverlee521’s stale review August 16, 2024 21:36

significant changes to "Filtering" section added

@victorlin victorlin requested a review from joverlee521 August 16, 2024 21:36
@victorlin victorlin changed the title Update subsampling docs Update filtering and subsampling docs Aug 16, 2024
@victorlin victorlin mentioned this pull request Aug 16, 2024
3 tasks
Reword some text and add an example for --query.
Copy link
Copy Markdown
Member

@jameshadfield jameshadfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice additions -- the added "notes" are especially helpful

Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst Outdated
Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst Outdated
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 5962ddb to 65f8fe4 Compare August 19, 2024 21:54
@victorlin
Copy link
Copy Markdown
Member Author

After working on #223 and finding it hard to fit weighted sampling docs nicely, I've come back here and replaced c334b31 with 65f8fe4. This seems much better.

@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 65f8fe4 to b5cb803 Compare August 19, 2024 22:57
Comment on lines +11 to +16
Overview
========

``augur filter`` provides the flexibility to choose different subsets of input
data for various types of analysis. There are several options which can be
categorized based on the information source and selection method.
Copy link
Copy Markdown
Member Author

@victorlin victorlin Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This new overview portion looks more like reference material that should go on the CLI page. But also, the rest of this page relies heavily on the terminology laid out here. Maybe fine to keep it here and leave the CLI page as a reference for what the options are.

Copy link
Copy Markdown
Member

@jameshadfield jameshadfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! Thanks for humouring my ideas

Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst Outdated
Copy link
Copy Markdown
Contributor

@genehack genehack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some non-blocking suggestions.

Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst
Comment thread src/guides/bioinformatics/filtering-and-subsampling.rst
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch 2 times, most recently from 96f68da to 62e01e5 Compare August 20, 2024 23:29
victorlin and others added 3 commits August 20, 2024 16:34
Note that I'm introducing new terminology here: "preliminary" vs.
"subsampling" vs. "force-inclusive" filtering options. These are clearly
distinct in the order of operations, making these labels helpful for
explaining that process.

For "preliminary", I had considered a term such as "exclusive" to better
contrast with "force-inclusive". However, the expression syntax used for
options in this category can be both exclusive (--exclude-where
region!=Asia) and inclusive (--min-date 2012). This is also why
"inclusive" is not a sufficient name for the "force-inclusive" category.

Co-authored-by: James Hadfield <hadfield.james@gmail.com>
Reword the subsampling introduction with *what* it is, followed by
examples on *why* paired with *how*.

This also allows future sampling methods such as weighted sampling to be
added by simply including a new section.
@victorlin victorlin force-pushed the victorlin/update-subsampling-docs branch from 62e01e5 to 5779a70 Compare August 20, 2024 23:35
@victorlin victorlin merged commit ab86278 into master Aug 20, 2024
@victorlin victorlin deleted the victorlin/update-subsampling-docs branch August 20, 2024 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants