filter: Ambiguous years should not be allowed when grouping by month #1071

@victorlin

Description

Current Behavior

Grouping by month (`--group-by month`) accepts dates with ambiguous years, such as XXXX-01-XX, instead of dropping them.

Expected behavior

Strains with ambiguous years should be dropped, since a month group is supposed to represent an exact month in time (YYYY-MM), not just the month number (MM).

How to reproduce

cat >metadata.tsv <<~~
strain	date
SEQ1	1XXX-01-01
SEQ2	2XXX-01-01
SEQ3	3XXX-01-01
~~

augur filter \
   --metadata metadata.tsv \
   --group-by month \
   --sequences-per-group 1 \
   --subsample-seed 0 \
   --output-metadata out.tsv

cat out.tsv
# strain	date
# SEQ1	1XXX-01-01

Possible solution

Check for ambiguous years when grouping by month.
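
A minimal sketch of what such a check could look like, in Python. This is not augur's actual implementation: `month_group_key` and the regular expression are hypothetical names made up for illustration, and `SEQ4` is an invented record added to show a date that should still be kept. The assumption is that a date qualifies for a month group only when both the year and the month are fully specified, while the day may remain ambiguous.

```python
import re

# Hypothetical helper, not part of augur's API.
# Accepts YYYY-MM or YYYY-MM-DD where year and month are all digits;
# the day part may contain X placeholders.
_EXACT_YEAR_MONTH = re.compile(r"^(\d{4})-(\d{2})(?:-[\dX]{2})?$")


def month_group_key(date):
    """Return (year, month) for grouping, or None if the year or month is ambiguous."""
    match = _EXACT_YEAR_MONTH.match(date)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    if not 1 <= month <= 12:
        return None
    return (year, month)


# The three records from the reproduction above, plus one invented record (SEQ4).
dates = {
    "SEQ1": "1XXX-01-01",
    "SEQ2": "2XXX-01-01",
    "SEQ3": "3XXX-01-01",
    "SEQ4": "2020-01-XX",
}

kept = {}
for strain, date in dates.items():
    key = month_group_key(date)
    if key is not None:
        kept[strain] = key

dropped = sorted(set(dates) - set(kept))
```

Under this sketch, SEQ1 through SEQ3 land in `dropped` because their years contain `X`, while SEQ4 is kept with the key `(2020, 1)`.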

Labels

bug (Something isn't working)
