rmdup: dup-num-file not created if no duplicated reads #436

@fgvieira

I am running seqkit on a Red Hat server:

seqkit rmdup --threads 10  --dup-num-file dup.tsv --ignore-case --by-seq  --out-file collapsed.fastq.gz collapsed.rmdup.fastq.gz

However, seqkit rmdup does not create the file passed to --dup-num-file (dup.tsv) when the input file contains no duplicated reads.

The input file is a gzipped FASTQ:

$ zcat collapsed.fastq.gz | head -n 8
@T0_RID60_S1_CM000682.2_ngsngs:13496936-13497014_length:79_mod0000 F2 R1 merged_79_0
TAAGGAAGCAGTGGAAAAAGAATAAATGCTGTAGATGAGGACAAGAAATTAGTTGAACTTTAATAAACTTCAAATGACT
+
CCCGGGGGG=GGGJJJGJGJJGJJJJJJGJCJJC=GJJJJJJGG1JGGGJJCGJJJG=JGGCGJCCJJGJJJGJGCCJG
@T2_RID60_S1_CM000666.2_ngsngs:130549431-130549518_length:88_mod0000 F3 R1 merged_88_0
TTTGCTCATATTTTGTGAAGTATTTTTATATCTGTATTCATGAATGATATTGCCATGCAATTGTCTTTTATTTTAATAATCTTGTCTT
+
CC8G=GGGGGGGGJJGJJJJJJJJJGCGGJJGJCJJJJJGJG8J1J=GJCJGJJJJJ(GGJGJGGJGGGGJJJJGJGGGCJJCJGGCJ
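
Until this is addressed, one possible workaround for pipelines that expect dup.tsv to exist is to create an empty file when seqkit leaves none behind. This is only a sketch: ensure_file is a hypothetical helper name, not part of seqkit, and it assumes an empty file is an acceptable stand-in for "no duplicates found".

```shell
# Hypothetical helper (not part of seqkit): create an empty file at the
# given path, but only if nothing exists there yet.
ensure_file() {
    [ -e "$1" ] || : > "$1"
}

# Run this after the seqkit rmdup command shown above has finished.
ensure_file dup.tsv
```

An existing dup.tsv is left untouched, so the call is safe to run unconditionally after every rmdup invocation.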
