Welcome to delt-hit, an end-to-end computational framework for DNA-encoded chemical library analysis.
This guide provides instructions for setting up delt-hit for both regular users and developers.
Before you begin, make sure you have the following installed:
We recommend using the Miniconda package manager to create an isolated environment for this project. This ensures that all dependencies are managed correctly.
- Download and install Miniconda for your operating system.
- After installation, you should be able to use the `conda` command in your terminal.
Some analysis features in delt-hit (like enrichment analysis with edgeR) depend on R.
- Install R: Download and install R from the Comprehensive R Archive Network (CRAN).
- Install R Packages: Once R is installed, open an R console and run the following commands to install the required packages:

      # Install tidyverse and GGally from CRAN
      install.packages(c("tidyverse", "GGally"))

      # Install BiocManager
      if (!require("BiocManager", quietly = TRUE))
          install.packages("BiocManager")

      # Install edgeR and limma from Bioconductor
      BiocManager::install(c("edgeR", "limma"))
This is the recommended way for most users.
- Create and activate a Conda environment:

      conda create -n delt-hit python=3.12 -y
      conda activate delt-hit

  Tip: Always activate this environment (`conda activate delt-hit`) before using `delt-hit`.

- Install `delt-hit`: Install `pygraphviz` with Conda, then install the package directly from GitHub using `pip`:

      conda install pygraphviz -y
      pip install git+https://github.com/DELTechnology/delt-hit.git

  Note: The `delt-hit` package is under active development. To get the latest version of `cutadapt` required by this package, run `pip install git+https://github.com/marcelm/cutadapt.git` (this step can be skipped once Cutadapt 4.10 is released).

- Verify Installation: Check that the CLI is working:

      delt-hit --help

  You should see a list of available commands.
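The prerequisites above can also be checked from Python before running anything. The following is a minimal sketch, not part of delt-hit: the helper name `missing_tools` is hypothetical, and it assumes that a working setup exposes `conda`, `Rscript` (installed alongside R) and `delt-hit` on your `PATH`.

```python
import shutil


def missing_tools(tools=("conda", "Rscript", "delt-hit")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```

If `delt-hit` is reported as missing, the most likely cause is that the Conda environment has not been activated in the current shell.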
If you want to contribute to the development of delt-hit, follow these steps.
- Configure SSH for GitHub: Make sure you have an SSH key added to your GitHub account to clone the repository.

- Clone the Repository:

      git clone git@github.com:DELTechnology/delt-hit.git
      cd delt-hit

- Create and activate the Conda environment:

      conda create -n delt-dev python=3.12 -y
      conda activate delt-dev

- Install in Editable Mode: Install the package with all development and testing dependencies:

      pip install -e ".[dev,test]"

  Note: This "editable" install means that any changes you make to the source code are immediately reflected when you run the `delt-hit` command.

- (Optional) Install `pigz` for parallel processing: For faster demultiplexing on macOS, install `pigz` using Homebrew:

      brew install pigz
Here is a typical workflow for using delt-hit:
- Initialize Configuration: Create a `config.yaml` file from an Excel library file. This file defines the experiment, selections, and library information.

      delt-hit init --excel_path /path/to/library.xlsx
- Run Demultiplexing: Run the entire demultiplexing pipeline based on your configuration. This includes preparing scripts, running `cutadapt`, and processing the results.

      delt-hit demultiplex run --config_path /path/to/config.yaml
- Define Analysis Groups: After demultiplexing, define analysis groups by editing your `config.yaml` file. Add an `analyses` section to group selections for comparison. For example:

      experiments:
        - name: protein_vs_no_protein
          save_dir: experiments/template/analysis
          selections:
            - name: AG24_4
              counts_path: experiments/template/selections/AG24_4/counts.txt
              group: no_protein
            - name: AG24_5
              counts_path: experiments/template/selections/AG24_5/counts.txt
              group: no_protein
            - name: AG24_6
              counts_path: experiments/template/selections/AG24_6/counts.txt
              group: no_protein
            - name: AG24_13
              counts_path: experiments/template/selections/AG24_13/counts.txt
              group: protein
            - name: AG24_14
              counts_path: experiments/template/selections/AG24_14/counts.txt
              group: protein
            - name: AG24_15
              counts_path: experiments/template/selections/AG24_15/counts.txt
              group: protein
- Calculate Enrichment: Calculate enrichment for the defined groups using different methods. The `--name` argument must correspond to a group you defined in your `config.yaml`.

      # Using simple counts
      delt-hit analyse enrichment --config_path /path/to/config.yaml --name=protein_vs_no_protein --method=counts

      # Using edgeR for more sensitive statistical analysis
      delt-hit analyse enrichment --config_path /path/to/config.yaml --name=protein_vs_no_protein --method=edgeR
- Work with the Library: Enumerate the library, compute properties, and generate representations.

      # Enumerate all molecules in the library
      delt-hit library enumerate --config_path /path/to/config.yaml

      # Compute chemical properties
      delt-hit library properties --config_path /path/to/config.yaml

      # Generate molecular fingerprints (e.g., Morgan)
      delt-hit library represent --method=morgan --config_path /path/to/config.yaml
- Launch Dashboard: Explore the results interactively in a web-based dashboard.

      delt-hit dashboard \
        --config_path /path/to/config.yaml \
        --counts_path /path/to/selections/SELECTION_NAME/counts.txt
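For repeated runs, the non-interactive steps of this workflow can be chained from a small script. The sketch below is an illustration rather than a delt-hit feature: the function names `workflow_commands` and `run_workflow` are hypothetical, it uses only the subcommands and flags shown in this guide, and it assumes that `config.yaml` already exists and already contains the analysis group passed as `group_name`.

```python
import subprocess

# Enrichment methods listed in this guide.
ENRICHMENT_METHODS = ("counts", "edgeR")


def workflow_commands(config_path, group_name, fingerprint="morgan"):
    """Return the delt-hit invocations from the workflow, in order."""
    cfg = ["--config_path", str(config_path)]
    commands = [["delt-hit", "demultiplex", "run", *cfg]]
    for method in ENRICHMENT_METHODS:
        commands.append(
            ["delt-hit", "analyse", "enrichment", *cfg,
             "--name", group_name, "--method", method]
        )
    commands.append(["delt-hit", "library", "enumerate", *cfg])
    commands.append(["delt-hit", "library", "properties", *cfg])
    commands.append(
        ["delt-hit", "library", "represent", "--method", fingerprint, *cfg]
    )
    return commands


def run_workflow(config_path, group_name, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in workflow_commands(config_path, group_name):
        print(" ".join(command))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(command, check=True)


# Dry run: prints the commands without executing them.
run_workflow("/path/to/config.yaml", "protein_vs_no_protein")
```

Keeping the default `dry_run=True` lets you review the exact commands first. The `init` and `dashboard` steps are left out because the first creates the configuration and the second starts an interactive session.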
For a codebase overview and a detailed CLI reference, see:
The original protocol description lives in protocols.pdf.
For the most up-to-date CLI details and output locations, use the CLI guide.
Initializes a project by creating a `config.yaml` from a standardized Excel file.

    delt-hit init --excel_path <path/to/library.xlsx>

Commands for enumerating the library and for calculating chemical properties and representations.
- `enumerate`: Generates the full library of molecules from the reaction steps defined in the configuration file.

      delt-hit library enumerate --config_path <path/to/config.yaml>

- `properties`: Calculates a set of chemical properties for the enumerated library.

      delt-hit library properties --config_path <path/to/config.yaml>

- `represent`: Generates molecular representations (fingerprints) for the library.

      delt-hit library represent --config_path <path/to/config.yaml> --method <METHOD>

  - `<METHOD>` can be `morgan` or `bert`.
Commands for demultiplexing FASTQ files and obtaining read counts.
- `run`: Runs the entire demultiplexing workflow, including running Cutadapt and computing counts.

      delt-hit demultiplex run --config_path <path/to/config.yaml>

- `prepare`: Prepares the `cutadapt` input files and executable script without running them.

      delt-hit demultiplex prepare --config_path <path/to/config.yaml>

- `process`: Computes counts from the output of a `cutadapt` run.

      delt-hit demultiplex process --config_path <path/to/config.yaml>

- `report`: Generates a text report summarizing demultiplexing statistics.

      delt-hit demultiplex report --config_path <path/to/config.yaml>

- `qc`: Generates quality control plots from the demultiplexing results.

      delt-hit demultiplex qc --config_path <path/to/config.yaml>
Commands for analyzing demultiplexed data, such as performing enrichment analysis.
- `enrichment`: Performs enrichment analysis on an analysis group defined in the configuration file.

      delt-hit analyse enrichment --config_path <path/to/config.yaml> --name <group_name> --method <METHOD>

  - Analysis groups must be defined under the `analyses` key in your `config.yaml`.
  - `<group_name>` refers to a key under the `analyses` section.
  - `<METHOD>` can be `counts` or `edgeR`.
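Writing an analysis group by hand gets repetitive when it contains many selections. The sketch below builds one group entry with the fields used in this guide's example (`name`, `save_dir`, `selections`, and per selection `name`, `counts_path`, `group`). The helpers `build_analysis_group` and `to_yaml_lines` are hypothetical and not part of delt-hit, the `<selections_root>/<selection>/counts.txt` layout is assumed from that example, and the output still has to be placed under the appropriate top-level key of your own `config.yaml`.

```python
from pathlib import PurePosixPath


def build_analysis_group(name, save_dir, selections_root, groups):
    """Return one analysis-group entry as a plain dict.

    `groups` maps a group label (e.g. "protein") to a list of selection names.
    """
    selections = []
    for group, selection_names in groups.items():
        for selection in selection_names:
            counts_path = PurePosixPath(selections_root) / selection / "counts.txt"
            selections.append(
                {"name": selection, "counts_path": str(counts_path), "group": group}
            )
    return {"name": name, "save_dir": save_dir, "selections": selections}


def to_yaml_lines(entry):
    """Render the entry as an indented YAML list item (hand-rolled, no dependencies)."""
    lines = [
        f"- name: {entry['name']}",
        f"  save_dir: {entry['save_dir']}",
        "  selections:",
    ]
    for sel in entry["selections"]:
        lines.append(f"    - name: {sel['name']}")
        lines.append(f"      counts_path: {sel['counts_path']}")
        lines.append(f"      group: {sel['group']}")
    return lines


entry = build_analysis_group(
    name="protein_vs_no_protein",
    save_dir="experiments/template/analysis",
    selections_root="experiments/template/selections",
    groups={
        "no_protein": ["AG24_4", "AG24_5", "AG24_6"],
        "protein": ["AG24_13", "AG24_14", "AG24_15"],
    },
)
print("\n".join(to_yaml_lines(entry)))
```

The hand-rolled renderer only covers simple names and paths without special YAML characters; for anything more involved, dumping the dict with a YAML library is the safer choice.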
Launches an interactive dashboard for data visualization.
- `dashboard`: Starts a web-based dashboard to interactively explore counts data for a given selection.

      delt-hit dashboard --config_path <path/to/config.yaml> --counts_path <path/to/counts.txt>