standage
commented
Jan 15, 2020
Codecov Report
@@          Coverage Diff          @@
##           master    #55   +/-   ##
=====================================
  Coverage     100%   100%
=====================================
  Files           9      8    -1
  Lines         247    209   -38
  Branches       41     31   -10
=====================================
- Hits          247    209   -38
standage
commented
Feb 13, 2020
standage
pushed a commit
that referenced
this pull request
Sep 11, 2024
Previously the `usa` panel included 100 loci for which there was *supposed* to be allele frequency data for all 19 sub-populations in a mock population roughly matching demographics in the United States. Due to a bug in that code, some loci in the panel do not have allele frequency data available for all populations. Also, since the initial panel slated for evaluation on real data (`beta`) contains 50 loci, it makes sense to restrict the `usa` panel to 50 loci as well. This update fixes the bug and limits the panel to the top 50 loci ranked by A<sub>e</sub>. Related to #55.
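As a rough illustration of the ranking step mentioned in this commit message, the sketch below computes the effective number of alleles, A<sub>e</sub> = 1 / Σ p<sub>i</sub><sup>2</sup>, from a locus's haplotype frequencies and keeps the top-ranked loci. The marker names and frequencies are made up, and the function names are hypothetical; the actual panel code may differ, for example by averaging A<sub>e</sub> across populations before ranking.

```python
def effective_alleles(freqs):
    """Effective number of alleles: 1 / sum of squared haplotype frequencies."""
    return 1.0 / sum(p * p for p in freqs)


def top_loci(freqs_by_locus, n):
    """Return the n locus names with the highest Ae, best first."""
    ranked = sorted(
        freqs_by_locus,
        key=lambda locus: effective_alleles(freqs_by_locus[locus]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical loci and frequencies, for illustration only
toy_freqs = {
    "marker_a": [0.25, 0.25, 0.25, 0.25],
    "marker_b": [0.5, 0.5],
    "marker_c": [0.9, 0.1],
}
ae_a = effective_alleles(toy_freqs["marker_a"])
best_two = top_loci(toy_freqs, 2)
```

A locus with many haplotypes at similar frequencies scores a high A<sub>e</sub>, which is why it is a natural ranking criterion for a panel.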
standage
pushed a commit
that referenced
this pull request
Sep 11, 2024
This update fixes a bug in `notebook/usa8k/Snakefile` resulting in two self-comparisons in the **samepop** data set. Fortunately the error was not in the simulation parameters, but in the comparison code. Thus the scope of the changes was very small. See #55.
MicroHapDB is intended, among other things, to enable design of panels that include microhaplotypes from disparate published sources. One obstacle to this, especially when it comes to interpretation, is the lack of frequencies for a unified set of population samples. This update uses 2,504 fully phased genomes from the 1000 Genomes Project to estimate microhaplotype frequencies across 26 global populations for all microhaplotypes defined in MicroHapDB.
1KGP-based frequencies published by ALFRED agree well with these new estimates: agreement is perfect in many cases, and differences are slight in most others. The differences are likely due to the use of PHASE by the ALFRED curators to statistically phase all of their aggregated microhap data. In this update, ALFRED's 1KGP-based frequency estimates are superseded by the frequency estimates obtained directly from the 1KGP phased haplotypes.
Frequency estimates could not be computed for 5 markers (mh06PK-24844, mh11PK-63643, mh15PK-75170, mh22PK-104638, and mh0XUSC-XqD), since they are defined using variants that were not genotyped in the 1KGP Phase 3 data.
Closes #42.
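For readers unfamiliar with the approach, here is a minimal sketch of how per-population microhaplotype frequencies can be tallied from phased haplotypes. It is not the code in this PR: the function name, the input layout (two allele tuples per sample), and the sample and population labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_frequencies(haplotypes, sample_pops):
    """Tally haplotype frequencies per population.

    haplotypes:  dict of sample -> (hap1, hap2), where each hap is a tuple of
                 alleles at the SNPs defining one microhaplotype (assumed layout)
    sample_pops: dict of sample -> population label
    """
    counts = defaultdict(Counter)
    for sample, pair in haplotypes.items():
        pop = sample_pops[sample]
        for hap in pair:  # each phased genome contributes two haplotypes
            counts[pop][",".join(hap)] += 1
    freqs = {}
    for pop, counter in counts.items():
        total = sum(counter.values())
        freqs[pop] = {hap: n / total for hap, n in counter.items()}
    return freqs


# Toy data with hypothetical sample and population names
toy_haps = {
    "S1": (("A", "T"), ("A", "C")),
    "S2": (("A", "T"), ("A", "T")),
    "S3": (("G", "C"), ("A", "C")),
}
toy_pops = {"S1": "POP1", "S2": "POP1", "S3": "POP2"}
freqs = estimate_frequencies(toy_haps, toy_pops)
```

Because the haplotypes are already phased, the estimate is a direct count with no statistical phasing step, which is the likely source of the small differences from the ALFRED values noted above.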