Description
Hi,
I'm having some trouble running StringTie2 (v2.1.4) to assemble mapped Iso-Seq subreads.
I first trim adapters.
lima subreads.bam adapters.fasta subreads.bam
Convert .bam to .fastq
samtools fastq -@ 30 -0 .subreads.fastq subreads.bam
The input .fastq file has 41,609,568 subreads with a mean length of 2,518.6 (range = 51 - 246,426). I then mapped the reads to the reference genome with minimap2.
minimap2 -ax splice -t 30 -uf --secondary=no -C5 genome.fasta subreads.fastq | samtools view -b > subreads_mapped2genome.bam
samtools sort -@ 50 -o subreads_mapped2genome.srt.bam subreads_mapped2genome.bam
Then I run Stringtie2 with the -L option:
stringtie subreads_mapped2genome.srt.bam -p 50 -L -v -l stringtie-isoseq-GG -o isoseq_stringtie.gtf
It runs fine at first and seems to produce the temporary output .gtf file, but eventually it stops and returns the message:
GVec error: invalid count: -2006062490
I found the lines in the GVec.hh file where this error is printed but I'm afraid I can't really make sense of what this actually is trying to tell me.
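One guess on my side, and it is only an assumption: a large negative count like this could be a 32-bit signed integer that wrapped around once. The sketch below uses plain bash arithmetic (nothing from StringTie itself) to compute what the count would have been under that assumption and compares it with the 32-bit signed maximum.

```shell
#!/usr/bin/env bash
# Value reported in the GVec error message.
reported=-2006062490

# ASSUMPTION: the counter is a 32-bit signed integer that overflowed exactly once.
# Under that assumption the intended value is reported + 2^32.
wrapped=$(( reported + (1 << 32) ))

# Largest value a 32-bit signed integer can hold.
int_max=$(( (1 << 31) - 1 ))

echo "candidate original count: ${wrapped}"
echo "32-bit signed maximum:    ${int_max}"
```

If the guess holds, the intended count would sit above the 32-bit signed limit. I have not confirmed that the counter involved is actually 32-bit, so please treat this as a hint only.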
Can you help me figure out what is going on?
Happy to provide more information if needed.
Also, it would be really great if some additional documentation/recommendations for running StringTie with long reads were available. At the moment there is hardly anything. E.g. do you recommend trimming adapters and poly-A tails from subreads? If so, what would you use? I have now trimmed adapters with lima but left the poly-A tails in.