This repository provides the official implementation of our paper:
Dual-Level Cross-Modality Neural Architecture Search for Guided Image Super-Resolution
Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, and Shiqi Wang
(IEEE Transactions on Pattern Analysis and Machine Intelligence)
Guided image super-resolution (GISR) aims to reconstruct a high-resolution (HR) target image from its low-resolution (LR) counterpart with the guidance of an HR image from another modality. Existing learning-based methods typically employ symmetric two-stream networks to extract features from both the guidance and target images, and then fuse these features at either an early or a late stage through manually designed modules to facilitate joint inference. Despite their strong performance, these methods still face several issues: i) the symmetric architectures treat images from different modalities equally, which may overlook the inherent differences between them; ii) lower-level features contain detailed information while higher-level features capture semantic structures, yet determining which layers should be fused and which fusion operations should be selected remains unresolved; iii) most methods achieve performance gains at the cost of increased computational complexity, so balancing the trade-off between computational complexity and model performance remains a critical issue. To address these issues, we propose a Dual-level Cross-modality Neural Architecture Search (DCNAS) framework to automatically design efficient GISR models. Specifically, we propose a dual-level search space that enables the NAS algorithm to identify effective architectures and optimal fusion strategies. Moreover, we propose a supernet training strategy that employs a performance predictor, trained with a pairwise ranking loss, to guide the supernet training process. To the best of our knowledge, this is the first attempt to introduce NAS into GISR tasks. Extensive experiments demonstrate that the discovered models, DCNAS-Tiny and DCNAS, achieve significant improvements on several GISR tasks, including guided depth map super-resolution, guided saliency map super-resolution, guided thermal image super-resolution, and pan-sharpening.
Furthermore, we analyze the architectures searched by our method and provide some new insights for future research.
Results shared via Baidu Netdisk: DCNAS RESULT.zip. Link: https://pan.baidu.com/s/1UvE9AxcsJTM4w7AqRdoFSA?pwd=GISR (extraction code: GISR)
- Python >= 3.7 (Recommend to use Anaconda or Miniconda)
- [PyTorch >= 2.0](https://pytorch.org/)
- NVIDIA GPU + CUDA
- Clone repo

      git clone https://github.com/zhwzhong/DCNAS.git
      cd DCNAS

- Install dependent packages

      pip install -r requirements.txt
Guided Depth Map SR:
For this task, we use two widely used benchmark datasets: the NYU v2 dataset and the RGB-D-D dataset. The NYU v2 dataset is a large-scale indoor dataset containing 1,449 RGB-D image pairs; we use the first 1,000 image pairs as the training set and the remaining 449 image pairs as the testing set. To verify the generalization ability of the proposed method, we further incorporate five additional datasets into our evaluation: 1) 1,064 RGB-D pairs from the Sintel dataset; 2) the test set of the DIDOE indoor dataset; 3) the first 500 RGB-D pairs from the SUN RGBD test set; 4) the test set of the RGB-D-D dataset; 5) the test set of the DIML indoor dataset. For the RGB-D-D dataset, we use the official training and testing splits. Download links: 1. NYU 2. Sintel 3. DIDOE 4. SUN RGBD 5. RGB-D-D 6. DIML
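The NYU v2 protocol above (first 1,000 pairs for training, remaining 449 for testing) amounts to a simple index-order split. A minimal sketch of that protocol follows; the helper name `split_nyu` is ours, and the repository's actual data loader may organize files differently:

```python
def split_nyu(pair_ids, num_train=1000):
    """Split NYU v2 RGB-D pair IDs into train/test by index order.

    Illustrative only: mirrors the paper's protocol (first 1,000 pairs
    for training, the remainder for testing).
    """
    return pair_ids[:num_train], pair_ids[num_train:]

# 1,449 pairs total -> 1,000 training pairs and 449 testing pairs
train_ids, test_ids = split_nyu(list(range(1449)))
```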
Guided Saliency Map SR:
For this task, we utilize the DUT-OMRON dataset as the testing set and employ bicubic downsampling with a scale factor of 8 to generate LR saliency maps.
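A minimal sketch of this degradation step, assuming PyTorch's `F.interpolate` as the bicubic resampler (the paper does not mandate a particular implementation):

```python
import torch
import torch.nn.functional as F

def bicubic_downsample(hr, scale=8):
    """Downsample an (N, C, H, W) tensor by `scale` with bicubic interpolation."""
    return F.interpolate(hr, scale_factor=1.0 / scale,
                         mode="bicubic", align_corners=False)

hr_map = torch.rand(1, 1, 256, 256)   # toy HR saliency map
lr_map = bicubic_downsample(hr_map)   # x8 downsampling -> (1, 1, 32, 32)
```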
Guided Thermal Image SR:
For this task, we use the training set provided by this work as our training set. Since the authors do not provide ground-truth images for their testing set, we employ the validation set as the testing set and randomly select 100 image pairs from the training set to serve as our validation set.
Pansharpening:
We employ the dataset provided by this work as the training and testing dataset.
You can also train the models yourself:

    torchrun --nnodes 1 --nproc_per_node=4 --rdzv_backend=c10d --rdzv_endpoint=localhost:12343 main.py --train_supernet --model MODEL_NAME --dataset DATA_NAME --scale SCALE
Common options:
| Argument | Description | Example |
|---|---|---|
| --train_supernet | supernet training | --train_supernet |
| --search | architecture search | --search |
| --train_random | train the searched network | --train_random |
| --num_blocks | number of blocks for each stage | --num_blocks 4 |
| --num_stages | number of down/up-sampling stages | --num_stages 4 |
| --num_features | feature channels | --num_features 8 |
| --model | model name | --model NAME |
| --scale | super-resolution upscale factor | --scale=16 |
| --batch_size | training batch size | --batch_size=16 |
| --lr | initial learning rate | --lr=1e-4 |
| --epochs | number of training epochs | --epochs=200 |
| --dataset | dataset name | --dataset=NYU |
| ... | ... | ... |
After the supernet is trained, run the architecture search and then train the searched network:

    python main.py --search --model MODEL_NAME --dataset DATA_NAME
    python main.py --train_random --model MODEL_NAME --dataset DATA_NAME
We provide the pre-trained models in [Model Zoo]. With a trained model, you can test your own images:

    python main.py --test_only --model MODEL_NAME --dataset DATA_NAME
Experimental results shared via Baidu Netdisk: DCNAS RESULT.zip (extraction code: GISR)
We thank the editors and the reviewers for their insightful comments, which were very helpful in improving our paper!
    @ARTICLE{DCNAS,
      author={Zhong, Zhiwei and Liu, Xianming and Jiang, Junjun and Zhao, Debin and Wang, Shiqi},
      journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
      title={Dual-Level Cross-Modality Neural Architecture Search for Guided Image Super-Resolution},
      year={2025},
      volume={47},
      number={9},
      pages={8249-8267},
      doi={10.1109/TPAMI.2025.3578468}}

