"OverflowError: FASTA/FASTQ record does not fit into buffer" when trimming ONT reads #783

@diego-rt

Hi @marcelm

I'm using cutadapt 4.4 with Python 3.10.12 and I'm running into the error below when trimming the ultra-long ULK114 adapters from the reads of one specific ONT PromethION flowcell. I suspect it is related to the file containing a few megabase-sized reads.

This is a description of the content of the file:

[diego.terrones@clip-login-1 6890b2ec397f656fd26681dc2d5e9b]$ seqkit stat -a reads.filtered.fq.gz 
file                  format  type  num_seqs        sum_len  min_len   avg_len    max_len      Q1      Q2      Q3  sum_gap     N50  Q20(%)  Q30(%)  GC(%)
reads.filtered.fq.gz  FASTQ   DNA    100,077  4,291,610,866    1,032  42,883.1  1,124,436  18,573  32,187  56,211        0  58,783   90.34   82.26   46.2

This is the command:

cutadapt --cores 4 -g GCTTGGGTGTTTAACCGTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA --times 5 --error-rate 0.3 --overlap 30 -m 1000 -o trimmed.fq.gz reads.filtered.fq.gz

This is the output:

This is cutadapt 4.4 with Python 3.10.12
Command line parameters: --cores 4 -g GCTTGGGTGTTTAACCGTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA --times 5 --error-rate 0.3 --overlap 30 -m 1000 -o trimmed.fq.gz reads.filtered.fq.gz
Processing single-end reads on 4 cores ...
ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 87, in run
    for index, chunks in enumerate(self._read_chunks(*files)):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 98, in _read_chunks
    for chunk in dnaio.read_chunks(files[0], self.buffer_size):
  File "/usr/local/lib/python3.10/dist-packages/dnaio/chunks.py", line 109, in read_chunks
    raise OverflowError("FASTA/FASTQ record does not fit into buffer")
OverflowError: FASTA/FASTQ record does not fit into buffer

ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 87, in run
    for index, chunks in enumerate(self._read_chunks(*files)):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 98, in _read_chunks
    for chunk in dnaio.read_chunks(files[0], self.buffer_size):
  File "/usr/local/lib/python3.10/dist-packages/dnaio/chunks.py", line 109, in read_chunks
    raise OverflowError("FASTA/FASTQ record does not fit into buffer")
OverflowError: FASTA/FASTQ record does not fit into buffer

ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 87, in run
    for index, chunks in enumerate(self._read_chunks(*files)):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 98, in _read_chunks
    for chunk in dnaio.read_chunks(files[0], self.buffer_size):
  File "/usr/local/lib/python3.10/dist-packages/dnaio/chunks.py", line 109, in read_chunks
    raise OverflowError("FASTA/FASTQ record does not fit into buffer")
OverflowError: FASTA/FASTQ record does not fit into buffer

ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 87, in run
    for index, chunks in enumerate(self._read_chunks(*files)):
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 98, in _read_chunks
    for chunk in dnaio.read_chunks(files[0], self.buffer_size):
  File "/usr/local/lib/python3.10/dist-packages/dnaio/chunks.py", line 109, in read_chunks
    raise OverflowError("FASTA/FASTQ record does not fit into buffer")
OverflowError: FASTA/FASTQ record does not fit into buffer

Traceback (most recent call last):
  File "/usr/local/bin/cutadapt", line 8, in <module>
    sys.exit(main_cli())
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/cli.py", line 1061, in main_cli
    main(sys.argv[1:])
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/cli.py", line 1131, in main
    stats = run_pipeline(
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 469, in run_pipeline
    statistics = runner.run()
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 350, in run
    chunk_index = self._try_receive(connection)
  File "/usr/local/lib/python3.10/dist-packages/cutadapt/runners.py", line 386, in _try_receive
    raise e
OverflowError: FASTA/FASTQ record does not fit into buffer
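
In case it helps with narrowing this down, here is a minimal sketch for measuring the largest record in the file in bytes, so it can be compared against the buffer size that cutadapt passes to `dnaio.read_chunks` (`self.buffer_size` in the traceback). It assumes plain four-line FASTQ records; the function name `largest_fastq_record_bytes` is made up for illustration and is not part of cutadapt or dnaio. Since a FASTQ record stores both the sequence and the qualities, the record for the 1,124,436-base read should take at least roughly twice that many bytes.

```python
import gzip
import io
from typing import BinaryIO


def largest_fastq_record_bytes(handle: BinaryIO) -> int:
    """Return the size in bytes of the largest four-line FASTQ record.

    Hypothetical helper: assumes every record is exactly four lines
    (header, sequence, '+' separator, qualities).
    """
    largest = 0
    while True:
        lines = [handle.readline() for _ in range(4)]
        if not lines[0]:
            break  # clean end of file
        if not lines[3]:
            raise ValueError("truncated FASTQ record at end of input")
        largest = max(largest, sum(len(line) for line in lines))
    return largest


# Small in-memory demonstration with a single 4-base read.
demo = io.BytesIO(b"@r1\nACGT\n+\nIIII\n")
print(largest_fastq_record_bytes(demo))  # 16

# For the real file, something along these lines should work:
# with gzip.open("reads.filtered.fq.gz", "rb") as fh:
#     print(largest_fastq_record_bytes(fh))
```

If the number printed for the real file is larger than the buffer size, that would be consistent with the megabase-sized reads being the cause.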

Many thanks!
