
jemmyleee/HSSDCT

HSSDCT: Factorized Spatial-Spectral Correlation for Hyperspectral Image Fusion

arXiv Python 3.8+ PyTorch 2.0+ License: MIT

πŸš€ News

  • [2026/01] πŸ”₯ Our paper has been accepted by ICASSP 2026!

Abstract

This repository contains the official PyTorch implementation of HSSDCT (Hierarchical Spatial-Spectral Dense Correlation Network).

HSSDCT introduces a novel framework for fusing low-resolution hyperspectral images (LR-HSI) with high-resolution multispectral images (HR-MSI). Unlike recent Transformer-based methods, which suffer from quadratic complexity, our method proposes two key components:

  • Hierarchical Dense-Residue Transformer Block (HDRTB): Progressively enlarges receptive fields with dense-residue connections for multi-scale feature aggregation.
  • Spatial-Spectral Correlation Layer (SSCL): Explicitly factorizes spatial and spectral dependencies, reducing self-attention to linear complexity while mitigating spectral redundancy.
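To make the complexity claim concrete, here is a minimal NumPy sketch of attention taken along the spectral (band) axis rather than over pixel tokens. This illustrates the general factorization idea only; it is not the paper's SSCL, the projections are untrained identities, and `spectral_attention` is a name invented here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spectral_attention(x):
    """Toy band-axis attention. x: (N, C), N = H*W pixels, C bands.

    The attention map is C x C, so the cost is O(N * C^2) -- linear in
    the number of pixels -- versus O(N^2 * C) for token-wise attention.
    """
    attn = softmax(x.T @ x / np.sqrt(x.shape[0]), axis=0)  # (C, C)
    return x @ attn, attn

out, attn = spectral_attention(np.random.rand(64 * 64, 172))
```

With 172 bands and a 256×256 image, a C×C map (172×172) is vastly smaller than the 65536×65536 map token-wise self-attention would require.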

Extensive experiments demonstrate that HSSDCT achieves state-of-the-art performance with significantly lower computational costs compared to recent methods like FusionMamba and QRCODE.

Table of Contents

  • Environment Setup
  • Data Preparation
  • Usage
  • Project Structure
  • Configuration Options
  • Evaluation Metrics
  • Results
  • Citation
  • License
  • Contact

Environment Setup

Prerequisites

  • Python 3.8 or higher
  • CUDA 12.x compatible GPU (recommended: NVIDIA GPU with β‰₯16GB memory)
  • Conda (recommended) or pip

Installation

Option 1: Using pip (Recommended)

# Clone the repository
git clone https://github.com/your-username/HSSDCT.git
cd HSSDCT

# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Option 2: Using Conda

# Create conda environment
conda create -n hssdct python=3.10
conda activate hssdct

# Install PyTorch with CUDA support
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia

# Install remaining dependencies
pip install -r requirements.txt
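After either option, a quick way to confirm the core dependencies are importable is a small check script like the one below (a generic sketch, not a file from this repository):

```python
import importlib.util
import sys

def check_env(pkgs=("torch", "torchvision", "numpy")):
    # Report which of the required packages are importable here,
    # without actually importing them.
    status = {p: importlib.util.find_spec(p) is not None for p in pkgs}
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for pkg, found in status.items():
        print(f"  {pkg}: {'found' if found else 'MISSING'}")
    return all(status.values())
```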

Data Preparation

Dataset Structure

The dataset should be organized in the following structure under your data root directory:

<data_root>/
β”œβ”€β”€ train/
β”‚   β”œβ”€β”€ <scene_name>/
β”‚   β”‚   β”œβ”€β”€ GT.npz.npy          # Ground truth HRHSI (256Γ—256Γ—172), float32
β”‚   β”‚   β”œβ”€β”€ hrmsi.npz           # High-resolution MSI, contains keys: 'hrmsi4', 'hrmsi6'
β”‚   β”‚   └── lrhsi.npz           # Low-resolution HSI, contains keys: 'lrhsi', 'lrhsi1', 'lrhsi2', 'lrhsi3'
β”‚   β”œβ”€β”€ Hama9_15/
β”‚   β”œβ”€β”€ TBD4_39/
β”‚   └── ...
β”œβ”€β”€ val/
β”‚   β”œβ”€β”€ <scene_name>/
β”‚   β”‚   β”œβ”€β”€ GT.npz.npy
β”‚   β”‚   β”œβ”€β”€ hrmsi.npz
β”‚   β”‚   └── lrhsi.npz
β”‚   └── ...
└── test/
    β”œβ”€β”€ <scene_name>/
    β”‚   β”œβ”€β”€ GT.npz.npy
    β”‚   β”œβ”€β”€ hrmsi.npz
    β”‚   └── lrhsi.npz
    └── ...

Data File Specifications

File         Format             Shape                Description
GT.npz.npy   NumPy array        (256, 256, 172)      Ground truth high-resolution hyperspectral image
hrmsi.npz    NumPy compressed   (256, 256, 4 or 6)   High-resolution multispectral image (4 or 6 bands)
lrhsi.npz    NumPy compressed   (64, 64, 172)        Low-resolution hyperspectral image (4× downsampled)
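All three files of a scene can be loaded with NumPy alone. The sketch below is illustrative (`load_scene` is not a function from this repository); the key names follow the specification above:

```python
import numpy as np

def load_scene(scene_dir, msi_bands=4):
    # GT.npz.npy is a plain .npy file despite its double extension.
    gt = np.load(f"{scene_dir}/GT.npz.npy").astype(np.float32)   # (256, 256, 172)
    # hrmsi.npz holds one array per band count: 'hrmsi4' or 'hrmsi6'.
    hrmsi = np.load(f"{scene_dir}/hrmsi.npz")[f"hrmsi{msi_bands}"]
    # lrhsi.npz holds the 4x-downsampled cube under 'lrhsi'
    # (plus further scales 'lrhsi1'..'lrhsi3').
    lrhsi = np.load(f"{scene_dir}/lrhsi.npz")["lrhsi"]           # (64, 64, 172)
    return gt, hrmsi, lrhsi
```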

Usage

Training

To train the model from scratch:

python train.py \
    --root /path/to/your/data \
    --train_file ./data_path/train.txt \
    --val_file ./data_path/val.txt \
    --prefix EXPERIMENT_NAME \
    --batch_size 6 \
    --epochs 1000 \
    --lr 0.000055 \
    --lr_scheduler cosine \
    --msi_bands 4 \
    --bands 172 \
    --crop_size 128 \
    --image_size 256 \
    --network_mode 1 \
    --device cuda:0

Key Training Arguments

Argument         Default                        Description
--root           /ssd4t/Fusion_data             Root directory of the dataset
--prefix         ASTUDY_num2_BAND4_SWINY_SNR0   Experiment name for checkpoints and logs
--batch_size     6                              Training batch size
--epochs         1000                           Total training epochs
--lr             5.5e-5                         Initial learning rate
--lr_scheduler   cosine                         Learning rate scheduler (cosine or step)
--network_mode   1                              Network mode: 0 = single, 1 = LRHSI + HRMSI, 2 = triplet
--msi_bands      4                              Number of HRMSI spectral bands (4 or 6)
--bands          172                            Number of hyperspectral bands
--crop_size      128                            Training patch size
--snr            0                              Signal-to-noise ratio for AWGN (0 = no noise)
--nf             96                             Base number of feature channels
--gc             32                             Growth channels in dense blocks
--joint_loss     1                              Enable joint loss (L1 + SAM + band-wise MSE)
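The arguments above map onto a standard argparse setup. This is a sketch of just the subset listed in the table, with the tabulated defaults; the actual train.py may define additional options:

```python
import argparse

def build_parser():
    # Illustrative subset of the training arguments; defaults taken
    # from the table above.
    p = argparse.ArgumentParser(description="HSSDCT training (sketch)")
    p.add_argument("--root", default="/ssd4t/Fusion_data")
    p.add_argument("--prefix", default="ASTUDY_num2_BAND4_SWINY_SNR0")
    p.add_argument("--batch_size", type=int, default=6)
    p.add_argument("--epochs", type=int, default=1000)
    p.add_argument("--lr", type=float, default=5.5e-5)
    p.add_argument("--lr_scheduler", choices=["cosine", "step"], default="cosine")
    p.add_argument("--network_mode", type=int, choices=[0, 1, 2], default=1)
    p.add_argument("--msi_bands", type=int, choices=[4, 6], default=4)
    p.add_argument("--bands", type=int, default=172)
    p.add_argument("--crop_size", type=int, default=128)
    p.add_argument("--snr", type=int, default=0)
    p.add_argument("--nf", type=int, default=96)
    p.add_argument("--gc", type=int, default=32)
    p.add_argument("--joint_loss", type=int, default=1)
    return p

args = build_parser().parse_args([])  # defaults only
```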

Resume Training

python train.py \
    --resume_ind 100 \
    --resume_ckpt ./checkpoint/EXPERIMENT_NAME/best.pth \
    [other arguments...]

Project Structure

HSSDCT/
β”œβ”€β”€ train.py              # Main training script (training loop & validation)
β”œβ”€β”€ dataset.py            # Dataset classes for data loading
β”‚                         # - Pairwise (LRHSI + HRMSI) & Triplet loading
β”œβ”€β”€ trainOps.py           # Training utilities & Evaluation metrics
β”‚                         # - Losses: SAM Loss, BandWise MSE
β”‚                         # - Metrics: PSNR, ERGAS, RMSE
β”œβ”€β”€ utils.py              # General utilities (Activation, padding)
β”œβ”€β”€ requirements.txt      # Python dependencies
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ hssdct.py         # Main model architecture
β”‚   β”‚                     # - HSSDCT Framework
β”‚   β”‚                     # - HDRTB (Hierarchical Dense-Residue Transformer Block)
β”‚   β”‚                     # - SSCL (Spatial-Spectral Correlation Layer)
β”‚   └── module_util.py    # Weight initialization utilities
└── data_path/
    β”œβ”€β”€ train.txt         # Training sample list
    β”œβ”€β”€ val.txt           # Validation sample list
    └── test.txt          # Test sample list

Configuration Options

Network Architecture Parameters

Parameter      Default   Description
--nf           96        Base feature channels
--gc           32        Growth channels in dense blocks
--num_blocks   6         Number of repeated backbone blocks
--groups       1         Group convolution factor (1 = full, 4 = light)
--out_nc       172       Output channels (should match --bands)

Evaluation Metrics

The model is evaluated using the following metrics:

Metric   Description                                           Optimal
SAM      Spectral Angle Mapper (degrees)                       Lower is better
PSNR     Peak Signal-to-Noise Ratio (dB)                       Higher is better
RMSE     Root Mean Square Error                                Lower is better
ERGAS    Erreur Relative Globale Adimensionnelle de Synthèse   Lower is better
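For reference, the four metrics can be sketched in NumPy as follows. These are common textbook formulations, not necessarily identical to the implementations in trainOps.py:

```python
import numpy as np

def sam(gt, pred, eps=1e-8):
    # Spectral Angle Mapper in degrees, averaged over pixels.
    gt = gt.reshape(-1, gt.shape[-1])
    pred = pred.reshape(-1, pred.shape[-1])
    num = np.sum(gt * pred, axis=1)
    den = np.linalg.norm(gt, axis=1) * np.linalg.norm(pred, axis=1) + eps
    return np.degrees(np.mean(np.arccos(np.clip(num / den, -1.0, 1.0))))

def psnr(gt, pred, data_range=1.0):
    # Peak Signal-to-Noise Ratio in dB.
    mse = np.mean((gt - pred) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def rmse(gt, pred):
    # Root Mean Square Error over all pixels and bands.
    return np.sqrt(np.mean((gt - pred) ** 2))

def ergas(gt, pred, ratio=4):
    # ERGAS for a given spatial downsampling ratio (4x per the dataset spec).
    band_rmse = np.sqrt(np.mean((gt - pred) ** 2, axis=(0, 1)))
    band_mean = np.mean(gt, axis=(0, 1))
    return 100.0 / ratio * np.sqrt(np.mean((band_rmse / band_mean) ** 2))
```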

Results

Quantitative comparison on the AVIRIS dataset (see Table 1 in the paper):

Method          Params (M)   FLOPs (G)   PSNR (dB)   SAM
FusionMamba     21.68        134.47      30.741      1.978
QRCODE          41.88        2231.19     35.361      1.623
HSSDCT (Ours)   6.78         283.84      37.212      1.348


Visual comparison of fused results (Figure 5 from the paper).


Citation

If you find this work useful in your research, please consider citing:

@inproceedings{lee2026hssdct,
  title     = {HSSDCT: Factorized Spatial-Spectral Correlation for Hyperspectral Image Fusion},
  author    = {Lee, Chia-Ming and Ho, Yu-Hao and Lin, Yu-Fan and Lee, Jen-Wei and Kang, Li-Wei and Hsu, Chih-Chung},
  booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year      = {2026}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.


Contact

For questions or issues, please open a GitHub issue or contact jemmy112322@gmail.com.
