DistillFSS: Synthesizing Few-Shot Knowledge into a Lightweight Segmentation Model
Official implementation of DistillFSS

Figure: The DistillFSS framework.

🔥 Highlights

  • 🚀 Efficient Inference: No support images needed at test time; knowledge is distilled directly into the model
  • 🎯 Strong Performance: Competitive or superior results compared to state-of-the-art CD-FSS methods
  • 📊 Comprehensive Benchmark: New evaluation protocol spanning medical imaging, industrial inspection, and agriculture
  • ⚡ Scalable: Handles large support sets without computational explosion

📋 Abstract

Cross-Domain Few-Shot Semantic Segmentation (CD-FSS) seeks to segment unknown classes in unseen domains using only a few annotated examples. This setting is inherently challenging: source and target domains exhibit substantial distribution shifts, label spaces are disjoint, and support images are scarce, making standard episodic methods unreliable and computationally demanding at test time.

DistillFSS addresses these constraints through a teacher-student distillation process that embeds support-set knowledge directly into the model's parameters. By internalizing few-shot reasoning into a dedicated layer, our approach eliminates the need for support images during inference, enabling fast, lightweight deployment while maintaining the ability to adapt to novel classes through rapid specialization.

πŸ—οΈ Framework Overview

DistillFSS consists of two main components:

  1. Teacher Network: Processes the support set and encodes class-specific knowledge
  2. Student Network: Learns to segment without direct access to support images by distilling knowledge from the teacher

The distillation process embeds support-set information into the student's parameters, allowing efficient inference without episodic sampling.
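To make the mechanism concrete, here is a minimal sketch of one distillation step, assuming a PyTorch-style setup. The function and argument names are hypothetical and do not reflect the repository's actual API; the sketch only illustrates the student mimicking a support-conditioned teacher.

# Minimal, hypothetical sketch of one distillation step (not the repo's API)
import torch
import torch.nn.functional as F

def distill_step(teacher, student, query_image, support_images, support_masks, optimizer):
    with torch.no_grad():
        # The teacher conditions on the annotated support set.
        teacher_logits = teacher(query_image, support_images, support_masks)

    # The student predicts from the query image alone.
    student_logits = student(query_image)

    # Pixel-wise distillation loss against the teacher's soft targets.
    loss = F.binary_cross_entropy_with_logits(
        student_logits, torch.sigmoid(teacher_logits)
    )

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Because the support set appears only inside torch.no_grad(), the trained student needs nothing but the query image at inference time.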

📦 Installation

# Clone the repository
git clone https://github.com/pasqualedem/DistillFSS.git
cd DistillFSS

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies and create virtual environment
uv sync

# Activate the environment
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
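Alternatively, uv can execute commands inside the project environment without activating it first:

# Run a command in the uv-managed environment without activation
uv run python refine.py --help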

📊 Dataset Preparation

Our benchmark includes datasets from diverse domains. Follow the instructions below to download and prepare each dataset:

🌱 Agriculture Domain

WeedMap

mkdir -p data/WeedMap
cd data/WeedMap
# Download the zip from the official source
unzip 0_rotations_processed_003_test.zip

πŸ₯ Medical Imaging Domain

Nucleus Dataset

cd data
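# Requires the Kaggle CLI with API credentials in ~/.kaggle/kaggle.json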
kaggle competitions download -c data-science-bowl-2018
unzip data-science-bowl-2018.zip -d data-science-bowl
unzip data-science-bowl/stage1_train.zip -d Nucleus

KVASIR (Gastrointestinal)

cd data
wget https://datasets.simula.no/downloads/kvasir-seg.zip
unzip kvasir-seg.zip

Lung Cancer

cd data
wget https://prod-dcd-datasets-cache-zipfiles.s3.eu-west-1.amazonaws.com/5rr22hgzwr-1.zip
unzip 5rr22hgzwr-1.zip
mv "lungcancer/Lung cancer segmentation dataset with Lung-RADS class/"* lungcancer
rm -r "lungcancer/Lung cancer segmentation dataset with Lung-RADS class/"

ISIC (Skin Lesions)

mkdir -p data/ISIC
cd data/ISIC
wget https://isic-challenge-data.s3.amazonaws.com/2019/ISIC_2019_Training_GroundTruth.csv
wget https://isic-challenge-data.s3.amazonaws.com/2018/ISIC2018_Task1-2_Training_Input.zip
wget https://isic-challenge-data.s3.amazonaws.com/2018/ISIC2018_Task1_Training_GroundTruth.zip
unzip ISIC2018_Task1-2_Training_Input.zip
unzip ISIC2018_Task1_Training_GroundTruth.zip

🏭 Industrial & Infrastructure Domain

Pothole Mix

Download from Mendeley Data

Industrial Defects

mkdir -p data/Industrial
cd data/Industrial
wget "https://download.scidb.cn/download?fileId=6396c900bae2f1393c118ada" -O data.zip
wget "https://download.scidb.cn/download?fileId=6396c900bae2f1393c118ad9" -O data.json
unzip data.zip
mv data/* .
rm -r data

⬇️ Checkpoint Download

To facilitate benchmarking, pre-trained baseline model checkpoints can be downloaded using the provided scripts:

1. DCAMA-optimized checkpoints

Needed for DistillFSS experiments

bash scripts/download_dcama.sh

2. Other baseline checkpoints

Needed for comparison with other methods

bash scripts/download_baselines.sh

🚀 Getting Started

DistillFSS provides two main entry points for running grid search experiments:

1. Refinement/Distillation (refine.py)

Refinement (TransferFSS)

Fine-tune a pre-trained model on support examples for improved performance.

# Sequential execution
python refine.py grid --parameters parameters/refine/DATASET_NAME.yaml

# Parallel execution (creates SLURM scripts)
python refine.py grid --parameters parameters/refine/DATASET_NAME.yaml --parallel

# Only create SLURM scripts without running
python refine.py grid --parameters parameters/refine/DATASET_NAME.yaml --parallel --only_create

Distillation (DistillFSS)

Train a student model by distilling knowledge from a teacher network that processes support examples.

python distill.py grid --parameters parameters/distill/DATASET_NAME.yaml

The distillation process:

  • Creates a teacher-student architecture
  • Trains the student to mimic the teacher's outputs
  • Embeds support-set knowledge into the student's parameters
  • Evaluates on the test set after distillation

2. Speed Benchmarking

Evaluate the inference speed and efficiency of different models.

python distill.py grid --parameters parameters/speed.yaml

Configuration Files

The repository includes pre-configured parameter files organized by experiment type:

📊 Baseline Configurations (parameters/baselines/)

Standard baseline experiments for each dataset:

  • Industrial.yaml - Industrial defect segmentation
  • ISIC.yaml - Skin lesion segmentation
  • KVASIR.yaml - Gastrointestinal polyp segmentation
  • LungCancer.yaml - Lung nodule segmentation
  • Nucleus.yaml / Nucleus_hdmnet.yaml - Cell nucleus segmentation
  • Pothole.yaml - Road defect detection
  • WeedMap.yaml - Weed segmentation

🎓 Distillation Configurations (parameters/distill/)

Teacher-student distillation experiments:

  • Configurations for: Industrial, ISIC, KVASIR, LungCancer, Nucleus, Pothole, WeedMap

🔧 Refinement Configurations (parameters/refine/)

Fine-tuning experiments on support sets:

  • Configurations for: Industrial, ISIC, KVASIR, LungCancer, Nucleus, Pothole, WeedMap, deepglobe

⚡ Speed Benchmark Configuration (parameters/speed.yaml)

Benchmarking inference speed across models and datasets.
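As a rough orientation, a grid parameter file maps experiment options to the values to sweep. The keys below are purely hypothetical; consult the files shipped in parameters/ for the actual schema.

# Hypothetical grid file sketch; real keys are defined by parameters/*.yaml
dataset: KVASIR      # dataset to run on
shots: [5, 50]       # support-set sizes to sweep
lr: [1e-4, 1e-5]     # learning rates to sweep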

Example Usage

# Run baseline experiments on Industrial dataset
python refine.py grid --parameters parameters/baselines/Industrial.yaml

# Run distillation on KVASIR dataset
python distill.py grid --parameters parameters/distill/KVASIR.yaml

# Run refinement on WeedMap with parallel execution
python refine.py grid --parameters parameters/refine/WeedMap.yaml --parallel

# Run efficiency benchmarks
python distill.py grid --parameters parameters/speed.yaml

# Run experiments on additional datasets
python refine.py grid --parameters parameters/other/EVICAN.yaml

📈 Results

DistillFSS achieves competitive or superior performance across multiple domains while significantly reducing computational costs:

Experimental Results

Performance comparison (mIoU) against state-of-the-art methods on Medical and Industrial datasets. The low-shot setting uses k = 5, 9, or 10 and the high-shot setting k = 50, 60, or 80, depending on the dataset.

| Dataset (low/high k) | BAM (low) | Transfer (low) | Distill (low) | BAM (high) | Transfer (high) | Distill (high) |
|---|---|---|---|---|---|---|
| Lung Nodule (5/50) | 0.17 | 3.43 | 3.31 | 0.19 | 2.51 | 4.87 |
| ISIC (9/60) | 9.67 | 14.35 | 13.31 | 8.69 | 22.15 | 23.41 |
| KVASIR-Seg (5/50) | 18.96 | 45.18 | 37.29 | 23.03 | 59.97 | 57.09 |
| Nucleus (5/50) | 11.03 | 73.12 | 69.57 | 11.05 | 79.39 | 79.96 |
| WeedMap (5/50) | 6.63 | 51.01 | 44.43 | 6.16 | 64.18 | 61.96 |
| Pothole (5/50) | 1.46 | 17.36 | 17.01 | 2.23 | 31.77 | 31.96 |
| Industrial (10/80) | 4.98 | 4.09 | 3.50 | 4.86 | 48.19 | 46.09 |

Detailed results and ablation studies are available in the paper.

🔧 Project Structure

DistillFSS/
├── distill.py              # Main distillation entry point
├── refine.py               # Main refinement entry point
├── configs/                # Configuration files
├── distillfss/
│   ├── data/              # Dataset implementations
│   ├── models/            # Model architectures
│   ├── utils/             # Utilities (logging, tracking, etc.)
│   └── substitution.py    # Support set substitution strategies
├── data/                  # Dataset storage
└── out/                   # Output directory (logs, models, results)

📊 Experiment Tracking

DistillFSS integrates with Weights & Biases for experiment tracking. Configure your W&B credentials before running:

wandb login

Training metrics, predictions, and model checkpoints are automatically logged to W&B.
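W&B also supports offline logging through its standard WANDB_MODE environment variable, which is handy on clusters without internet access; recorded runs can be synced afterwards:

# Record runs locally, then sync them to the W&B server later
WANDB_MODE=offline python distill.py grid --parameters parameters/speed.yaml
wandb sync wandb/offline-run-*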

📦 Pre-trained Models

Access our collection of state-of-the-art checkpoints:

| Dataset | Number of Shots (k) | HF Repo |
|---|---|---|
| WeedMap | 5 | HF |

📚 Citation

If you find this work useful for your research, please consider citing:

@misc{marinisDistillFSSSynthesizingFewShot2025,
	title = {{DistillFSS}: {Synthesizing} {Few}-{Shot} {Knowledge} into a {Lightweight} {Segmentation} {Model}},
	shorttitle = {{DistillFSS}},
	url = {http://arxiv.org/abs/2512.05613},
	doi = {10.48550/arXiv.2512.05613},
	publisher = {arXiv},
	author = {Marinis, Pasquale De and Blok, Pieter M. and Kaymak, Uzay and Brussee, Rogier and Vessio, Gennaro and Castellano, Giovanna},
	month = dec,
	year = {2025},
	note = {arXiv:2512.05613 [cs]},
}

🙏 Acknowledgements

This work builds upon several excellent open-source projects and datasets. We thank the authors for making their code and data publicly available.

πŸ“ License

This project is released under the MIT License. See LICENSE for details.

📧 Contact

For questions or collaborations, please contact the authors.


Made with ❤️ for the Few-Shot Learning community
