Skip to content

Conversation

@biermanr
Copy link
Contributor

When using perbase base-depth with --keep-zeros I sometimes get missing and duplicate loci in the output, which I think is the same as issue #71.

I believe this small change to the process_regions function in base_depth.rs fixes the issue which occurs when there are no-depth loci skipped in the result:Vector<PileupPosition>.

The previous code created new_result from result assuming the following situation:

                                                      |End
    -------------xxxxxxxxxxxxxxxxx---------------------
    |
Start

where the x's are where the result vector recorded non-zero depth positions.

The prior code first fills in the "left" of x's, then extends new_result with the x's from result, then
fills in the "right" of the x's.

But the code produces incorrect output when the situation instead looks like:

                                                      |End
    -------------xxxxx-xxx-xx-xxxx---------------------
    |
Start

I've added two new tests cases to check_depths by creating two new rstest #[fixture] called non_mate_aware_keep_zeros_positions and mate_aware_keep_zeros_positions. I'm not sure if this was the best approach and I'm definitely open to suggestions.

Besides these two test cases I've also tested on a small .bam file wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/alignment/HG00096.chrom11.ILLUMINA.bwa.GBR.low_coverage.20120522.bam with the following command which was producing dups/skips before the code changes but now lists each locus once.

cargo build --release

echo "11    4702317 4704317" > test.bed #tab sep

./target/release/perbase base-depth \
    --threads 4 \ 
    --keep-zeros \
    --bed-file test.bed \
    --output counts.tsv \
    HG00096.chrom11.ILLUMINA.bwa.GBR.low_coverage.20120522.bam

@biermanr
Copy link
Contributor Author

I forgot to write that I really enjoy using this tool, thank you for making it!

@sstadick
Copy link
Owner

This looks fantastic! I'll try to take a look at it tomorrow and get it merged and released.

Thanks for taking the time to put this together!

@sstadick sstadick merged commit 0607422 into sstadick:master Jul 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants