
Add GENBANK2FAA converter#353

Merged
cokelaer merged 2 commits into main from copilot/fix-gbk-to-faa-issue
Mar 6, 2026

Copilot AI commented Mar 6, 2026

GenBank files contain CDS features with /translation qualifiers holding protein sequences, but there was no converter to extract these into FAA (FASTA amino acid) format.

Changes

  • bioconvert/genbank2faa.py: New GENBANK2FAA converter with a biopython method that iterates CDS features, extracts /translation qualifiers, and writes FAA output with >protein_id product headers. CDS features without a translation qualifier are skipped.
  • test/data/genbank/test_genbank2faa.gbk: Minimal GenBank fixture with two CDS entries.
  • test/data/faa/test_genbank2faa.faa: Expected FAA output for the fixture.
  • test/test_genbank2faa.py: Parametrized test with MD5 comparison against expected output.
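The extraction logic described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function names `cds_to_faa_lines` and `genbank_to_faa`, and the `"unknown"` fallback for a missing protein_id, are assumptions made for the example. The Biopython import is kept inside `genbank_to_faa` so the formatting step can be exercised with stand-in feature objects.

```python
from types import SimpleNamespace


def cds_to_faa_lines(features):
    """Yield FAA lines for CDS features that carry a /translation qualifier.

    `features` is any iterable of objects exposing `.type` and `.qualifiers`
    (a dict mapping qualifier names to lists of strings), as Biopython's
    SeqFeature does.
    """
    for feature in features:
        if feature.type != "CDS":
            continue
        translation = feature.qualifiers.get("translation")
        if not translation:
            continue  # CDS without a translation qualifier is skipped
        # "unknown" is a hypothetical fallback chosen for this sketch.
        protein_id = feature.qualifiers.get("protein_id", ["unknown"])[0]
        product = feature.qualifiers.get("product", [""])[0]
        yield f">{protein_id} {product}".rstrip()
        yield translation[0]


def genbank_to_faa(infile, outfile):
    """Write the CDS translations of every record in `infile` to `outfile`."""
    from Bio import SeqIO  # requires Biopython; imported lazily on purpose

    with open(outfile, "w") as out:
        for record in SeqIO.parse(infile, "genbank"):
            for line in cds_to_faa_lines(record.features):
                out.write(line + "\n")


# Stand-in features mimicking what Biopython would parse from a small file.
demo_features = [
    SimpleNamespace(type="gene", qualifiers={"gene": ["thrL"]}),
    SimpleNamespace(
        type="CDS",
        qualifiers={
            "protein_id": ["AAA00001.1"],
            "product": ["thr operon leader peptide"],
            "translation": ["MKRISTTITTTITITTGNGAG"],
        },
    ),
    SimpleNamespace(type="CDS", qualifiers={"protein_id": ["AAA00009.1"]}),
    SimpleNamespace(
        type="CDS",
        qualifiers={
            "protein_id": ["AAA00002.1"],
            "product": ["homoserine kinase"],
            "translation": ["MVKVYAPASSANMSVGFDVLG"],
        },
    ),
]
demo_lines = list(cds_to_faa_lines(demo_features))
```

Separating the formatting generator from the file I/O keeps the skip rule (no translation, no output) easy to check without a GenBank file on disk.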

Usage

from bioconvert.genbank2faa import GENBANK2FAA

converter = GENBANK2FAA("genome.gbk", "proteins.faa")
converter()  # uses biopython method by default

Output format:

>AAA00001.1 thr operon leader peptide
MKRISTTITTTITITTGNGAG
>AAA00002.1 homoserine kinase
MVKVYAPASSANMSVGFDVLG
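
The MD5 comparison mentioned for the test can be illustrated with the standard library alone. This is a hedged sketch of the idea, not the project's actual test helper: `md5sum` is a hypothetical name, and here both files are written by hand, whereas in the real test the produced file would come from the converter.

```python
import hashlib
import tempfile
from pathlib import Path


def md5sum(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


EXPECTED_FAA = (
    ">AAA00001.1 thr operon leader peptide\n"
    "MKRISTTITTTITITTGNGAG\n"
    ">AAA00002.1 homoserine kinase\n"
    "MVKVYAPASSANMSVGFDVLG\n"
)

with tempfile.TemporaryDirectory() as tmp:
    expected = Path(tmp) / "expected.faa"
    produced = Path(tmp) / "produced.faa"
    expected.write_text(EXPECTED_FAA)

    # Identical content gives identical digests.
    produced.write_text(EXPECTED_FAA)
    same = md5sum(produced) == md5sum(expected)

    # A single changed residue changes the digest.
    produced.write_text(EXPECTED_FAA.replace("MVKV", "MVRV"))
    differs = md5sum(produced) != md5sum(expected)
```

A digest comparison is strict: any difference in headers, sequence, or trailing newlines makes the test fail, which suits a converter whose output is expected to be byte-for-byte stable.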

Original prompt

This section details on the original issue you should resolve

<issue_title>gbk to faa</issue_title>
<issue_description>https://www.biostars.org/p/151891/</issue_description>

Comments on the Issue (you are @copilot in this section)



Co-authored-by: cokelaer <778821+cokelaer@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix gbk to faa conversion issue" to "Add GENBANK2FAA converter" Mar 6, 2026
cokelaer marked this pull request as ready for review March 6, 2026 14:18
cokelaer merged commit 75cf7eb into main Mar 6, 2026
1 check passed