This is the official code of the paper "DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data".


DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data

Framework

(a) Overview of our Distribution Alignment-based Language-Image Pre-Training (DALIP) method for biological data. DALIP optimizes CLIP models by matching the feature distributions of image-text pairs, which are efficiently approximated by the first- and second-order statistics of token features. (b) A Multi-head Brownian Distance Covariance (MBDC) module efficiently acquires the second-order statistics of token features.

Abstract

Recently, Contrastive Language-Image Pre-training (CLIP) has shown promising performance on domain-specific data (e.g., biology) and has attracted increasing research attention. Existing works generally focus on collecting extensive domain-specific data and directly tuning the original CLIP models. Intuitively, such a paradigm does not fully account for the characteristics of domain-specific data (e.g., the fine-grained nature of biological data), which limits model capability while largely sacrificing the original ability of CLIP in the general domain. In this paper, we propose a Distribution Alignment-based Language-Image Pre-Training (DALIP) method for biological data. Specifically, DALIP optimizes CLIP models by matching the feature distributions of image-text pairs instead of the original [cls] tokens, which captures rich yet effective information inherent in image-text pairs as powerful representations, and thus better copes with the fine-grained nature of biological data. In particular, DALIP efficiently approximates each feature distribution by its first- and second-order statistics, and presents a Multi-head Brownian Distance Covariance (MBDC) module to acquire second-order statistics of token features efficiently. Furthermore, we collect a new dataset for the plant domain (a representative domain-specific case in biology), namely PlantMix-13M, comprising 10M plant image-text pairs mixed with 3M general-domain pairs according to data mixing laws. Extensive experiments show that DALIP clearly outperforms existing CLIP counterparts in the biological domain and generalizes well to the remote sensing and medical imaging domains. Moreover, our PlantMix-13M dataset further boosts DALIP's performance in the plant domain while preserving its ability in the general domain.
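For intuition, below is a minimal PyTorch sketch of the two statistics described above, assuming token features of shape (batch, tokens, dim). The first-order statistic is the token mean; the second-order statistic is the classical Brownian distance covariance (a double-centered Euclidean distance matrix), computed channel-wise per head. The function names, head-splitting scheme, and normalization details here are our assumptions for illustration; the repository code and the paper define the actual MBDC module.

```python
import torch

def bdc_matrix(x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # x: (B, M, N) holds M variables, each observed over N tokens.
    # Returns the (B, M, M) double-centered Euclidean distance matrix,
    # i.e., the classical Brownian distance covariance estimator.
    sq = (x * x).sum(dim=-1, keepdim=True)                    # (B, M, 1)
    d2 = sq + sq.transpose(1, 2) - 2 * x @ x.transpose(1, 2)
    a = torch.sqrt(d2.clamp_min(0.0) + eps)                   # (B, M, M)
    row = a.mean(dim=2, keepdim=True)
    col = a.mean(dim=1, keepdim=True)
    grand = a.mean(dim=(1, 2), keepdim=True)
    return a - row - col + grand

def multi_head_bdc(tokens: torch.Tensor, num_heads: int = 4) -> torch.Tensor:
    # tokens: (B, N, D). Split the D channels into heads and compute a
    # channel-wise BDC matrix of size (D/H x D/H) per head. The flattened
    # descriptor is D*D/H-dimensional rather than D*D, which is what makes
    # the multi-head variant cheaper than one full BDC matrix.
    B, N, D = tokens.shape
    assert D % num_heads == 0, "D must be divisible by num_heads"
    dh = D // num_heads
    h = tokens.reshape(B, N, num_heads, dh).permute(0, 2, 3, 1)  # (B, H, dh, N)
    b = bdc_matrix(h.reshape(B * num_heads, dh, N))              # (B*H, dh, dh)
    return b.reshape(B, -1)                                      # (B, H*dh*dh)

def distribution_statistics(tokens: torch.Tensor, num_heads: int = 4):
    # First-order statistic (token mean) plus second-order statistic
    # (multi-head BDC). DALIP matches both between the image and text
    # branches; the exact similarity and fusion are defined in the paper.
    return tokens.mean(dim=1), multi_head_bdc(tokens, num_heads)
```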

Installation

  • Clone the repository:

```bash
git clone https://github.com/XavierHeart/DALIP
cd DALIP/
```

  • Install the dependencies:

```bash
pip install -r requirements.txt
```

Dataset

The PlantMix-13M dataset will be made publicly available soon. It contains:

  • 10M plant-domain image-text pairs
  • 3M general-domain image-text pairs
  • a plant-to-general data mixture carefully curated according to data mixing laws

Model

| Model | Arch. | Dataset | ImageNet-1K | CIFAR-100 | Cars | Pets | SUN397 | General Mean | PlantNet | Fungi | PlantVillage | Med. Leaf | PlantDoc | Plant Mean | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| OpenCLIP | ViT-B/16 | PlantMix-13M | 46.8 | 61.3 | 50.2 | 66.2 | 50.1 | 54.9 | 89.9 | 47.0 | 32.3 | 48.9 | 33.0 | 50.2 | 52.6 |
| DALIP | ViT-B/16 | PlantMix-13M | 49.2 | 69.2 | 58.9 | 75.2 | 55.6 | 61.6 | 91.0 | 52.8 | 34.5 | 43.7 | 34.3 | 51.3 | 56.4 |
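Since the project builds on OpenCLIP (see Acknowledgments), a released checkpoint should be loadable through the standard open_clip interface. Below is a minimal zero-shot classification sketch; the checkpoint filename and prompts are hypothetical placeholders, and the actual release may ship its own loading utilities.

```python
import torch
import open_clip
from PIL import Image

# Hypothetical local checkpoint path; DALIP weights are not (yet) a named
# open_clip pretrained tag, so we assume a local file.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-16", pretrained="dalip_vitb16_plantmix13m.pt")
tokenizer = open_clip.get_tokenizer("ViT-B-16")
model.eval()

image = preprocess(Image.open("leaf.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a healthy leaf", "a photo of a diseased leaf"])

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(probs)  # class probabilities over the text prompts
```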

Acknowledgments

This project is built upon the OpenCLIP codebase. We sincerely thank its authors for their outstanding contribution to the open-source community.
