Skip to content

DELTechnology/delt-hit

Repository files navigation

🧬 DELT-Hit

Welcome to delt-hit! An end-to-end computational framework for DNA-encoded chemical library analysis.

πŸš€ Installation

This guide provides instructions for setting up delt-hit for both regular users and developers.

Prerequisites

Before you begin, make sure you have the following installed:

1. Conda

We recommend using the Miniconda package manager to create an isolated environment for this project. This ensures that all dependencies are managed correctly.

2. R Environment

Some analysis features in delt-hit (like enrichment analysis with edgeR) depend on R.

  • Install R: Download and install R from the Comprehensive R Archive Network (CRAN).
  • Install R Packages: Once R is installed, open an R console and run the following commands to install the required packages:
    # Install tidyverse and GGally from CRAN
    install.packages(c("tidyverse", "GGally"))
    
    # Install BiocManager
    if (!require("BiocManager", quietly = TRUE))
        install.packages("BiocManager")
    
    # Install edgeR and limma from Bioconductor
    BiocManager::install(c("edgeR", "limma"))

πŸ§‘β€πŸ”¬ User Installation

This is the recommended way for most users.

  1. Create and activate a Conda environment:

    conda create -n delt-hit python=3.12 -y
    conda activate delt-hit

    πŸ’‘ Always activate this environment (conda activate delt-hit) before using delt-hit.

  2. Install delt-hit: Install the package directly from GitHub using pip:

    conda install pygraphviz -y
    pip install git+https://github.com/DELTechnology/delt-hit.git

    Note: The delt-hit package is under active development. To get the latest version of cutadapt required by this package, please run pip install git+https://github.com/marcelm/cutadapt.git (this command can be ignored once Cutadapt 4.10 is released).

  3. Verify Installation: Check that the CLI is working:

    delt-hit --help

    You should see a list of available commands.

πŸ‘©β€πŸ’» Developer Installation

If you want to contribute to the development of delt-hit, follow these steps.

  1. Configure SSH for GitHub: Make sure you have an SSH key added to your GitHub account to clone the repository.

  2. Clone the Repository:

    git clone git@github.com:DELTechnology/delt-hit.git
    cd delt-hit
  3. Create and activate the Conda environment:

    conda create -n delt-dev python=3.12 -y
    conda activate delt-dev
  4. Install in Editable Mode: Install the package with all development and testing dependencies:

    pip install -e ".[dev,test]"

    πŸ”§ This "editable" install means that any changes you make to the source code will be immediately reflected when you run the delt-hit command.

  5. (Optional) Install pigz for parallel processing: For faster demultiplexing on macOS, install pigz using Homebrew:

    brew install pigz

πŸ§ͺ Example Workflow

Here is a typical workflow for using delt-hit:

  1. Initialize Configuration: Create a config.yaml file from an Excel library file. This file defines the experiment, selections, and library information.

    delt-hit init --excel_path /path/to/library.xlsx
  2. Run Demultiplexing: Run the entire demultiplexing pipeline based on your configuration. This includes preparing scripts, running cutadapt, and processing the results.

    delt-hit demultiplex run --config_path /path/to/config.yaml
  3. Define Analysis Groups: After demultiplexing, define analysis groups by editing your config.yaml file. Add an analyses section to group selections for comparison. For example:

    experiments:
    - name: protein_vs_no_protein
      save_dir: experiments/template/analysis
      selections:
        - name: AG24_4
          counts_path: experiments/template/selections/AG24_4/counts.txt
          group: no_protein
        - name: AG24_5
          counts_path: experiments/template/selections/AG24_5/counts.txt
          group: no_protein
        - name: AG24_6
          counts_path: experiments/template/selections/AG24_6/counts.txt
          group: no_protein
        - name: AG24_13
          counts_path: experiments/template/selections/AG24_13/counts.txt
          group: protein
        - name: AG24_14
          counts_path: experiments/template/selections/AG24_14/counts.txt
          group: protein
        - name: AG24_15
          counts_path: experiments/template/selections/AG24_15/counts.txt
          group: protein
  4. Calculate Enrichment: Calculate enrichment for the defined groups using different methods. The --name argument must correspond to a group you defined in your config.yaml.

    # Using simple counts
    delt-hit analyse enrichment --config_path /path/to/config.yaml --name=protein_vs_no_protein --method=counts
    
    # Using edgeR for more sensitive statistical analysis
    delt-hit analyse enrichment --config_path /path/to/config.yaml --name=protein_vs_no_protein --method=edgeR
  5. Work with the Library: Enumerate the library, compute properties, and generate representations.

    # Enumerate all molecules in the library
    delt-hit library enumerate --config_path /path/to/config.yaml
    
    # Compute chemical properties
    delt-hit library properties --config_path /path/to/config.yaml
    
    # Generate molecular fingerprints (e.g., Morgan)
    delt-hit library represent --method=morgan --config_path /path/to/config.yaml
  6. Launch Dashboard: Explore the results interactively in a web-based dashboard.

    delt-hit dashboard \
    --config_path /path/to/config.yaml \
    --counts_path /path/to/selections/SELECTION_NAME/counts.txt

πŸ“š Documentation

For a codebase overview and a detailed CLI reference, see:

The original protocol description lives in protocols.pdf.

πŸ’» CLI Reference

For the most up-to-date CLI details and output locations, use the CLI guide.

init

Initializes a project by creating a config.yaml from a standardized Excel file.

delt-hit init --excel_path <path/to/library.xlsx>

library

Commands for library enumeration, and chemical property and representation calculation.

  • enumerate: Generates the full library of molecules from the reaction steps defined in the configuration file.
    delt-hit library enumerate --config_path <path/to/config.yaml>
  • properties: Calculates a set of chemical properties for the enumerated library.
    delt-hit library properties --config_path <path/to/config.yaml>
  • represent: Generates molecular representations (fingerprints) for the library.
    delt-hit library represent --config_path <path/to/config.yaml> --method <METHOD>
    • <METHOD> can be morgan or bert.

demultiplex

Commands for demultiplexing FASTQ files and obtaining read counts.

  • run: Runs the entire demultiplexing workflow, including running Cutadapt and computing counts.
    delt-hit demultiplex run --config_path <path/to/config.yaml>
  • prepare: Prepares the cutadapt input files and executable script without running them.
    delt-hit demultiplex prepare --config_path <path/to/config.yaml>
  • process: Computes counts from the output of a cutadapt run.
    delt-hit demultiplex process --config_path <path/to/config.yaml>
  • report: Generates a text report summarizing demultiplexing statistics.
    delt-hit demultiplex report --config_path <path/to/config.yaml>
  • qc: Generates quality control plots from the demultiplexing results.
    delt-hit demultiplex qc --config_path <path/to/config.yaml>

analyse

Commands for analyzing demultiplexed data, such as performing enrichment analysis.

  • enrichment: Performs enrichment analysis on an analysis group defined in the configuration file.
    delt-hit analyse enrichment --config_path <path/to/config.yaml> --name <group_name> --method <METHOD>
    • Analysis groups must be defined under the analyses key in your config.yaml.
    • <group_name> refers to a key under the analyses section.
    • <METHOD> can be counts or edgeR.

dashboard

Launches an interactive dashboard for data visualization.

  • dashboard: Starts a web-based dashboard to interactively explore counts data for a given selection.
    delt-hit dashboard --config_path <path/to/config.yaml> --counts_path <path/to/counts.txt>

About

Core functionalities to work with DECL libraries

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors