Buggy results:
reference
>ref
GCATGGCATACAAATTATTTCATTCCCATTGAGAAATAAAATCCAATTCTCCATCACCAAGAGAGCCTTCCGAAAGAGGCCCCCCTGGGCAAACGGCCACCGATGGAGAGGTCTGCCAGTCCTCTTCTACCCCACCCACGCCCCCACCCTAATCAGAGGCCAAACCCTTCCTGGAGCCTGTGATAAAAGCAACTGTTAGCTTGCACTAGACTAGCTTCAAAGTTGTATTGACCCTGGTGTGTTATGTCTAAGAGTAGATGCCATATCTCTTTTCTGG
read pairs
read1
@rq_182_1:0:0_2:0:0_0
CACAGGCTCCAGGAAGGGTTTGGCCTCTGAATAGGGTGGGGGCGTGGGTGGGGTAGAAGAGGACTGGCAGACCTCTCCATCGGTGGCCGTTTGCCCAGGGGGGCCTCTTTCGGAAGGCTCTCTTGTT
+
2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
read2
@rq_182_1:0:0_2:0:0_0
GAAATAAAATCCAATTCTCCAACACCAAGCCGAAAGAGGCCCCCCTGGGCAAACGAAACACATCGGCCACCGATGGAGAGGTCTGCCAGTCCTCTTCTACCCCACCCACGCCCCCACC
+
2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222
alignment result
rq_182_1:0:0_2:0:0_0 83 ref 56 60 127M = 33 -150 AACAAGAGAGCCTTCCGAAAGAGGCCCCCCTGGGCAAACGGCCACCGATGGAGAGGTCTGCCAGTCCTCTTCTACCCCACCCACGCCCCCACCCTATTCAGAGGCCAAACCCTTCCTGGAGCCTGTG 2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222 NM:i:2 MD:Z:1C94A30 MC:Z:29M8D24M10I55M AS:i:120 XS:i:0
rq_182_1:0:0_2:0:0_0 163 ref 33 60 29M8D24M10I55M = 56 150 GAAATAAAATCCAATTCTCCAACACCAAGCCGAAAGAGGCCCCCCTGGGCAAACGAAACACATCGGCCACCGATGGAGAGGTCTGCCAGTCCTCTTCTACCCCACCCACGCCCCCACC 2222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222222 NM:i:19 MD:Z:21T7^AGAGCCTT79 MC:Z:127M AS:i:73 XS:i:0
however samtools depth -q 10 -s generate depth not right at two position
ref 147 2
ref 148 2
these two positions are overlapped of course.
I checked the code in sam.c which tweak the quality of overlapped regions and found the bug as below:
each time a new cigar M or X or = is met, if there is another cigar which is none of "MX=", the cigar will be set to 0 first, and then enter the situation to process "MX=", in this case, it increase the positions (*iseq)++; (*icig)++; (*iref)++(https://github.com/samtools/htslib/blob/develop/sam.c#L4459), thus the first matched base in this kind of "MX=" cigar will never be processed in a right way which leads the variable "iref" in https://github.com/samtools/htslib/blob/develop/sam.c#4510 will not increase to the right amount, so finally the tweak loop in https://github.com/samtools/htslib/blob/develop/sam.c#L4500 will not iterate all the overlapped positions and left some end overlapped bases unprocessed, the number of unprocessed bases equals to number of those kinds of "MX=" cigar which appears after other kinds of cigar in overlapped region.
I have fixed this bug by #1196 , please consider merge it if I have done it properly.
Buggy results:
these two positions are overlapped of course.
I checked the code in sam.c which tweak the quality of overlapped regions and found the bug as below:
each time a new cigar M or X or = is met, if there is another cigar which is none of "MX=", the cigar will be set to 0 first, and then enter the situation to process "MX=", in this case, it increase the positions (*iseq)++; (*icig)++; (*iref)++(https://github.com/samtools/htslib/blob/develop/sam.c#L4459), thus the first matched base in this kind of "MX=" cigar will never be processed in a right way which leads the variable "iref" in https://github.com/samtools/htslib/blob/develop/sam.c#4510 will not increase to the right amount, so finally the tweak loop in https://github.com/samtools/htslib/blob/develop/sam.c#L4500 will not iterate all the overlapped positions and left some end overlapped bases unprocessed, the number of unprocessed bases equals to number of those kinds of "MX=" cigar which appears after other kinds of cigar in overlapped region.
I have fixed this bug by #1196 , please consider merge it if I have done it properly.