GitHub - bcb-unl/run_dbcan: Run_dbcan V5

run_dbcan - Standalone Tool of dbCAN3

Announcement

⚠️ Important Notice:
Due to a recent cyberattack, our primary dbCAN web server is currently offline, and you will not be able to access the online database. Our IT team is actively working to resolve the issue. We apologize for any inconvenience this may cause.

In the meantime, you can still obtain the dbCAN database using our AWS S3 backup. Recommended methods:

1. Use the run_dbcan database command (recommended):

run_dbcan database --db_dir db --aws_s3

This command will download and organize the database files automatically.

2. Download via wget (not for folders):

Note that wget cannot download an entire folder from an S3 bucket; it can only fetch individual files. To download all files with wget, you must list them and specify each file’s URL directly, for example:

wget https://dbcan.s3.us-west-2.amazonaws.com/db_v5-2_9-13-2025/some_file

If you want to download the entire folder, please use the AWS CLI as follows:

aws s3 cp s3://dbcan/db_v5-2_9-13-2025/ ./db --recursive
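If the AWS CLI is not available, the file-by-file wget approach can be scripted. The following is a minimal sketch, not an official helper: it assumes you have already written the file names into a plain-text list, one per line. The list file (`files.txt` in the usage comment) and the function names `build_urls` and `download_listed_files` are hypothetical. The base URL is the one from the wget example above.

```shell
#!/usr/bin/env bash
# Base URL copied from the wget example above.
BASE_URL="https://dbcan.s3.us-west-2.amazonaws.com/db_v5-2_9-13-2025"

# Read file names from stdin and print one download URL per name.
# Blank lines and lines starting with '#' are skipped.
build_urls() {
    local name
    while IFS= read -r name || [ -n "$name" ]; do
        case "$name" in
            ''|'#'*) continue ;;
        esac
        printf '%s/%s\n' "$BASE_URL" "$name"
    done
}

# Download every file named in a list file into a target directory.
# The list file is hypothetical: you must create it yourself.
# Usage: download_listed_files files.txt db
download_listed_files() {
    local list_file="$1" out_dir="${2:-db}"
    mkdir -p "$out_dir"
    # --continue resumes partial downloads; --input-file=- reads URLs from stdin.
    build_urls < "$list_file" | wget --continue --directory-prefix="$out_dir" --input-file=-
}
```

Unlike `run_dbcan database`, this only fetches the files; it does not organize them afterwards.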

For more details on database downloads, please refer to our documentation.

If you have any questions or need help, feel free to open an issue.

Update

10/20/2025:

SignalP6.0 Topology Annotation: Added support for SignalP6.0 signal peptide prediction. Use the --run_signalp flag in the CAZyme_annotation command to enable topology annotation. Results are automatically added to the overview.tsv file.
Global Logging System: Implemented a comprehensive logging system with --log-level, --log-file, and --verbose options for better debugging and monitoring.
Database Download Command: Added a new database command for easy database downloading. It supports both HTTP and AWS S3 sources (use the --aws_s3 flag for faster downloads). Use --cgc/--no-cgc to control CGC-related database downloads.
Code Structure Improvements: Continued refactoring with object-oriented programming, improved modularity, and centralized configuration management.
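As an illustration of how the database flags listed above combine, here is a small wrapper sketch. It uses only flags named in this README (`--db_dir`, `--aws_s3`, `--cgc`, `--no-cgc`). The function name `run_dbcan_database` and the `DRY_RUN` variable are hypothetical conveniences, not part of run_dbcan. Whether CGC databases are downloaded when neither `--cgc` nor `--no-cgc` is given is not stated here, so the sketch passes neither flag unless asked.

```shell
#!/usr/bin/env bash
# Build and run a `run_dbcan database` command.
#   $1: target directory for --db_dir (default: db)
#   $2: "s3" to add --aws_s3; anything else keeps the default HTTP source
#   $3: "cgc" to add --cgc, "no-cgc" to add --no-cgc; empty adds neither
# Set DRY_RUN=1 (hypothetical variable) to print the command instead of running it.
run_dbcan_database() {
    local db_dir="${1:-db}" src="${2:-http}" cgc="${3:-}"
    local -a cmd=(run_dbcan database --db_dir "$db_dir")
    if [ "$src" = "s3" ]; then
        cmd+=(--aws_s3)
    fi
    if [ "$cgc" = "cgc" ]; then
        cmd+=(--cgc)
    elif [ "$cgc" = "no-cgc" ]; then
        cmd+=(--no-cgc)
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}
```

For example, `DRY_RUN=1 run_dbcan_database db s3 no-cgc` prints the command line so it can be checked before any download starts.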

5/12/2025: The dev-dbcan branch is used to test new functions and fix issues. After testing, it will be merged into the main branch and the docker/conda/pypi packages will be updated. If you want to use the beta functions before then, replace the code folder (dbcan) in your current package with the one from this branch.

3/16/2025:

Rewrote the structure of run_dbcan 4.0 (suggested by Haidong) using object-oriented programming (OOP) to improve maintainability and readability.
Added a new function, cgc_circle, which visualizes CGCs in a genome.

Future plans: Add prediction of food consumption through CAZymes. If you have new suggestions, please contact Dr. Yanbin Yin (yyin@unl.edu), Xinpeng Zhang (xzhang55@huskers.unl.edu), and Dr. Haidong Yi (hyi@stjude.org).

Introduction

Notice

This is the updated version of run_dbcan 4.0. Many changes have been made; they are described at https://run-dbcan.readthedocs.io/en/latest/. From now on, this repo is the official run_dbcan site, and the run_dbcan 4.0 site will no longer be maintained.

run_dbcan is the standalone version of the dbCAN3 annotation tool for automated CAZyme annotation. It incorporates pyHMMER (replacing HMMER for better performance), Diamond, and dbCAN_sub for annotating CAZyme families, and integrates CAZyme Gene Cluster (CGC) identification and substrate prediction.

Main Commands

The tool provides the following main commands:

database - Download dbCAN databases (supports HTTP and AWS S3)
CAZyme_annotation - Annotate CAZymes using Diamond, pyHMMER, and dbCAN-sub
gff_process - Generate GFF files for CGC identification
cgc_finder - Identify CAZyme Gene Clusters (CGCs)
substrate_prediction - Predict substrate specificities of CGCs
cgc_circle_plot - Generate circular plots for CGCs
easy_CGC - Complete CGC analysis pipeline (annotation + GFF processing + CGC identification)
easy_substrate - Complete CGC analysis with substrate prediction
Pfam_null_cgc - Annotate null genes in CGCs using Pfam

All commands support global logging options: --log-level, --log-file, and --verbose.
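To see the arguments each command accepts, one option is to print every command's help text in turn. The sketch below assumes that `run_dbcan` is installed and on your `PATH`, and that each subcommand accepts a `--help` option; that is an assumption based on common CLI conventions and is not stated in this README. The array and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Subcommand names copied from the "Main Commands" list above.
DBCAN_COMMANDS=(
    database
    CAZyme_annotation
    gff_process
    cgc_finder
    substrate_prediction
    cgc_circle_plot
    easy_CGC
    easy_substrate
    Pfam_null_cgc
)

# Print the help text of every subcommand, separated by a header line.
# Assumes each subcommand accepts --help (not confirmed by this README).
show_all_help() {
    local cmd
    for cmd in "${DBCAN_COMMANDS[@]}"; do
        printf '===== %s =====\n' "$cmd"
        run_dbcan "$cmd" --help
    done
}
```

Calling `show_all_help | less` gives a single scrollable reference for all nine commands.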

For usage discussions, visit our issue tracker. To learn more, read the dbcan doc. If you're interested in contributing, whether through issues or pull requests, please review our contribution guide.

Reference

Please cite the following dbCAN publications if you use run_dbcan in your research:

dbCAN3: automated carbohydrate-active enzyme and substrate annotation

Jinfang Zheng, Qiwei Ge, Yuchen Yan, Xinpeng Zhang, Le Huang, Yanbin Yin

Nucleic Acids Research, 2023, gkad328, doi: 10.1093/nar/gkad328.

dbCAN2: a meta server for automated carbohydrate-active enzyme annotation

Han Zhang, Tanner Yohe, Le Huang, Sarah Entwistle, Peizhi Wu, Zhenglu Yang, Peter K Busk, Ying Xu, Yanbin Yin

Nucleic Acids Research, Volume 46, Issue W1, 2 July 2018, Pages W95–W101, doi: 10.1093/nar/gky418.

dbCAN-seq: a database of carbohydrate-active enzyme (CAZyme) sequence and annotation

Le Huang, Han Zhang, Peizhi Wu, Sarah Entwistle, Xueqiong Li, Tanner Yohe, Haidong Yi, Zhenglu Yang, Yanbin Yin

Nucleic Acids Research, Volume 46, Issue D1, 4 January 2018, Pages D516–D521, doi: 10.1093/nar/gkx894.
