fast and robust BED to GTF/GFF conversion in rust
- Converts BED to GTF or GFF/GFF3
- Supports .bed and .bed.gz inputs
- Supports .gtf, .gtf.gz, .gff, .gff3, .gff.gz, and .gff3.gz outputs
- Uses memory-mapped I/O for uncompressed file inputs
- Uses rayon for parallel chunked conversion
- Reads from stdin when --input is omitted
- Writes to stdout when --output is omitted
- Accepts an optional transcript-to-gene mapping file through --isoforms
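
To make the conversion concrete, here is a minimal Python sketch of how one BED12 record maps onto GTF-style transcript and exon rows. It is an illustration only, not bed2gtf's implementation: the function name, the `source` column value, and the attribute layout are assumptions, and the features and attributes bed2gtf actually writes may differ. The coordinate shift is the standard one: BED is 0-based and half-open, GTF is 1-based and inclusive.

```python
def bed12_to_gtf_lines(bed_line, gene_id=None, source="bed2gtf_sketch"):
    """Turn one BED12 line into GTF-style transcript and exon lines.

    Hypothetical helper for illustration; not part of bed2gtf.
    """
    f = bed_line.rstrip("\n").split("\t")
    chrom, start, end, name, strand = f[0], int(f[1]), int(f[2]), f[3], f[5]
    # blockSizes and blockStarts are comma-separated and may end with a comma.
    block_sizes = [int(x) for x in f[10].split(",") if x]
    block_starts = [int(x) for x in f[11].split(",") if x]

    # Without a mapping, fall back to the transcript name (an assumption).
    gene = gene_id or name
    attrs = f'gene_id "{gene}"; transcript_id "{name}";'

    def row(feature, start0, end0):
        # 0-based half-open [start0, end0) becomes 1-based inclusive [start0 + 1, end0].
        return "\t".join(
            [chrom, source, feature, str(start0 + 1), str(end0), ".", strand, ".", attrs]
        )

    lines = [row("transcript", start, end)]
    for size, offset in zip(block_sizes, block_starts):
        exon_start = start + offset  # block starts are relative to chromStart
        lines.append(row("exon", exon_start, exon_start + size))
    return lines


if __name__ == "__main__":
    bed = "chr1\t100\t500\ttx1\t0\t+\t150\t450\t0\t2\t100,150\t0,250"
    print("\n".join(bed12_to_gtf_lines(bed)))
```

The optional `gene_id` argument stands in for a value looked up in an --isoforms mapping.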
bed2gtf [OPTIONS]
Options:
  -i, --input <BED>        Input BED path; reads stdin when omitted
  -o, --output <OUTPUT>    Output path; writes stdout when omitted
      --to <gtf|gff>       Output format for stdout; required when --output is absent
  -I, --isoforms <TSV>     Optional transcript-to-gene map
  -t, --type <BED_TYPE>    BED layout: 3, 4, 5, 6, 8, 9, 12 [default: 12]
  -T, --threads <N>        Worker threads [default: logical CPU count]
  -c, --chunks <N>         Parallel chunk size [default: 15000]
  -g, --gz                 Gzip stdout or require a .gz output path
  -L, --level <LEVEL>      Log level: error, warn, info, debug, trace [default: info]
  -h, --help               Print help
  -V, --version            Print version
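
The --threads and --chunks options describe a chunked parallel scheme: the input is cut into fixed-size groups of records and each group is converted by a worker. The Python sketch below illustrates that general pattern only. bed2gtf itself uses rayon in Rust; the helper names and the placeholder `convert_chunk` are hypothetical, and whether bed2gtf preserves input order in its output is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def convert_chunk(lines):
    # Placeholder for the real per-record conversion (hypothetical).
    return [line.upper() for line in lines]


def convert_parallel(lines, chunk_size=15000, threads=4):
    """Convert `lines` chunk by chunk on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map returns results in submission order.
        results = pool.map(convert_chunk, chunked(lines, chunk_size))
        return [out for chunk in results for out in chunk]


if __name__ == "__main__":
    print(convert_parallel(["a", "b", "c", "d", "e"], chunk_size=2, threads=2))
```

Larger chunks mean fewer tasks and less scheduling overhead; smaller chunks balance work more evenly across threads.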
cargo install bed2gtf

to build from source instead:
- get rust
- run git clone https://github.com/alejandrogzi/bed2gtf.git && cd bed2gtf
- run cargo run --release -- -i <BED> -o <OUTPUT>
to build the development container image:
- run git clone https://github.com/alejandrogzi/bed2gtf.git && cd bed2gtf
- initialize docker with start docker or systemctl start docker
- build the image with docker image build --tag bed2gtf .
- run docker run --rm -v "[dir_where_your_files_are]:/dir" bed2gtf -i /dir/<BED> -o /dir/<OUTPUT>
to use bed2gtf through Conda, run:
conda install bed2gtf -c bioconda
or
conda create -n bed2gtf -c bioconda bed2gtf
to use bed2gtf through Nextflow as a module:
- borrow main.nf from here
- If --output is present, the output format is derived from its extension.
- If --output is absent, --to gtf or --to gff is required.
- If the input path is .bed.gz, the output must also be gzip-compressed.
- When writing to stdout, gzip output is enabled automatically for gzip input.
--gz is mainly useful for stdout. For file output, the path must end in .gz when gzip is requested.
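
The rules above can be read as a small decision procedure. The following Python sketch encodes them as stated. It is not taken from bed2gtf's source: the function name, the error messages, and the choice to treat any .gz output path as gzip-compressed even without --gz are assumptions.

```python
def resolve_output(input_path=None, output_path=None, to=None, gz=False):
    """Return (format, gzip) for the given CLI-style arguments.

    Hypothetical helper that mirrors the documented rules.
    """
    gz_input = bool(input_path) and input_path.endswith(".gz")

    if output_path is None:
        # Writing to stdout: the format must be given explicitly.
        if to not in ("gtf", "gff"):
            raise ValueError("--to gtf or --to gff is required when --output is absent")
        # Gzip when requested, or automatically for gzip input.
        return to, gz or gz_input

    # Writing to a file: the format comes from the extension.
    gz_output = output_path.endswith(".gz")
    stem = output_path[:-3] if gz_output else output_path
    ext = stem.rsplit(".", 1)[-1] if "." in stem else ""
    if ext not in ("gtf", "gff", "gff3"):
        raise ValueError(f"unsupported output extension: {output_path}")
    if (gz or gz_input) and not gz_output:
        raise ValueError("output path must end in .gz for gzip output or gzip input")
    return ext, gz_output


if __name__ == "__main__":
    print(resolve_output("transcripts.bed", "transcripts.gtf"))
    print(resolve_output("transcripts.bed.gz", None, to="gff"))
```

Under these assumptions, a .bed.gz input paired with a plain .gtf output path is rejected, matching the third rule.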
--isoforms is optional.
When provided, it must point to a two-column tab-separated file:
ENST00000335137 ENSG00000186092
ENST00000423372 ENSG00000237613
- Column 1: transcript ID
- Column 2: gene ID
Blank lines and # comments are ignored.
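
As an illustration of the file format described above, a reader for such a mapping might look like the Python sketch below. The function name and the decision to raise an error on malformed rows are assumptions; how bed2gtf itself handles bad rows is not stated here.

```python
def read_isoforms(lines):
    """Parse transcript-to-gene lines into a dict {transcript_id: gene_id}.

    Hypothetical helper; skips blank lines and lines starting with '#'.
    """
    mapping = {}
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 tab-separated columns, got {len(fields)}")
        transcript_id, gene_id = fields
        mapping[transcript_id] = gene_id
    return mapping


if __name__ == "__main__":
    sample = [
        "# transcript\tgene\n",
        "\n",
        "ENST00000335137\tENSG00000186092\n",
        "ENST00000423372\tENSG00000237613\n",
    ]
    print(read_isoforms(sample))
```

The function accepts any iterable of lines, so an open file handle can be passed directly.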
Convert an uncompressed BED12 file to GTF:
bed2gtf --input transcripts.bed --output transcripts.gtf

Convert a gzip-compressed BED file to gzip-compressed GFF3:
bed2gtf --input transcripts.bed.gz --output transcripts.gff3.gz

Write GTF to stdout:
bed2gtf --input transcripts.bed --to gtf

Read from stdin and write GFF to stdout:
cat transcripts.bed | bed2gtf --to gff

Use a transcript-to-gene mapping file:
bed2gtf \
  --input transcripts.bed \
  --output transcripts.gtf \
  --isoforms isoforms.tsv

Enable debug logging:
bed2gtf --input transcripts.bed --output transcripts.gtf --level debug