CreatiLayout

CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
Hui Zhang, Dexiang Hong, Yitong Wang, Jie Shao, Xinglong Wu, Zuxuan Wu, and Yu-Gang Jiang
Fudan University & ByteDance Inc.

Introduction

CreatiLayout is a layout-to-image framework for Diffusion Transformer models, offering high-quality and fine-grained controllable generation.

LayoutSAM Dataset 📚: A large-scale layout dataset with 2.7 million image-text pairs and 10.7 million entities, featuring fine-grained annotations for open-set entities.

SiamLayout 🌟: A novel layout integration network for MM-DiT that treats the layout as an independent modality with its own set of transformer parameters, allowing the layout to play a role as important as the global description in guiding image generation.

Layout Designer 🎨: A layout planner leveraging the power of large language models to convert various user inputs (e.g., center points, masks, scribbles) into standardized layouts.

🔥 News

2025-6-26: CreatiLayout was accepted by ICCV 2025 🎉🎉.
2025-3-10: We release CreatiLayout-FLUX, which empowers FLUX.1-dev for layout-to-image generation and achieves more precise rendering of spatial relationships and attributes.
2025-1-30: We propose CreatiLayout-LoRA, which achieves layout control with fewer additional parameters.

Quick Start

Setup

Environment setup

conda create -n creatilayout python=3.10 -y
conda activate creatilayout
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia

Requirements installation

pip install -r requirements.txt

Usage example

You can run the following code to generate an image:

python test_sample.py

Or you can try gradio at .

Dataset

LayoutSAM

The LayoutSAM dataset is a large-scale layout dataset derived from the SAM dataset, containing 2.7 million image-text pairs and 10.7 million entities. Each entity is annotated with a spatial position (i.e., bounding box) and a textual description. Traditional layout datasets often exhibit a closed-set and coarse-grained nature, which may limit the model's ability to generate complex attributes such as color, shape, and texture.
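As a rough illustration of the annotation structure described above, the sketch below models one sample as a global caption plus a list of entities, each with a bounding box and a description. The class names, field names, and the pixel-space `(x1, y1, x2, y2)` box convention are assumptions made for this example; they are not taken from the released dataset schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

BBox = Tuple[float, float, float, float]


@dataclass
class Entity:
    """One annotated entity: a bounding box plus a free-form description."""
    bbox: BBox  # (x1, y1, x2, y2) in pixels; this convention is an assumption
    description: str


@dataclass
class LayoutSample:
    """Hypothetical record: global caption, image size, and its entities."""
    caption: str
    width: int
    height: int
    entities: List[Entity]


def _clamp01(value: float) -> float:
    """Clamp a value to the closed interval [0, 1]."""
    return min(max(value, 0.0), 1.0)


def normalize_bbox(bbox: BBox, width: int, height: int) -> BBox:
    """Scale a pixel-space box to [0, 1] and clamp it to the image bounds."""
    x1, y1, x2, y2 = bbox
    return (
        _clamp01(x1 / width),
        _clamp01(y1 / height),
        _clamp01(x2 / width),
        _clamp01(y2 / height),
    )


sample = LayoutSample(
    caption="A red vintage car parked beside a stone wall",
    width=1024,
    height=512,
    entities=[
        Entity((256, 128, 768, 384), "a glossy red vintage car with chrome trim"),
    ],
)
normalized = [normalize_bbox(e.bbox, sample.width, sample.height) for e in sample.entities]
```

Free-form descriptions, rather than a fixed label set, are what make the annotations open-set: attributes such as color, shape, and texture live in the description string rather than in a class ID.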

LayoutSAM-Eval Benchmark

LayoutSAM-Eval is a comprehensive benchmark for evaluating the quality of Layout-to-Image (L2I) generation models. It assesses L2I generation quality from two perspectives: region-wise quality (spatial and attribute accuracy) and global-wise quality (visual quality and prompt following). Region-wise quality is measured by having a vision-language model (VLM) answer visual questions about spatial and attribute adherence; global-wise quality is measured with IR score, Pick score, CLIP score, FID, and IS.

To evaluate a model's layout-to-image generation capabilities with LayoutSAM-Eval, first generate an image for each sample in the benchmark by running:

python test_SiamLayout_sd3_layoutsam_benchmark.py

Then, a vision-language model (VLM) answers visual questions about each generated image to assess its adherence to the spatial and attribute specifications. Run:

python score_layoutsam_benchmark.py
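To make the region-wise scoring idea concrete, here is a minimal sketch of how yes/no VLM answers could be averaged into per-category accuracies. The `(question_type, is_yes)` input format and the function name `region_scores` are assumptions for illustration only; the actual output format and aggregation in `score_layoutsam_benchmark.py` may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def region_scores(answers: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Average yes/no VLM answers per question type.

    `answers` holds (question_type, is_yes) pairs. This format is an
    assumption for illustration, not the repository's actual output.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for question_type, is_yes in answers:
        totals[question_type] += 1
        hits[question_type] += int(is_yes)
    # Fraction of "yes" answers for each question type.
    return {key: hits[key] / totals[key] for key in totals}


# Toy answers: four spatial checks and two attribute checks.
answers = [
    ("spatial", True), ("spatial", True), ("spatial", False), ("spatial", True),
    ("attribute", True), ("attribute", False),
]
scores = region_scores(answers)
```

Under this toy input, three of the four spatial questions and one of the two attribute questions are answered "yes", so the sketch reports a higher spatial than attribute accuracy.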

Models

Layout-to-Image generation:

Model	Base model	Description
	Stable Diffusion 3	SiamLayout-SD3 used in the paper
	Stable Diffusion 3	SiamLayout-SD3-LoRA used in the paper
	FLUX.1-dev	SiamLayout-FLUX used in the paper

✒️ Citation

If you find our work useful for your research and applications, please kindly cite using this BibTeX:

@article{zhang2024creatilayout,
  title={CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation},
  author={Zhang, Hui and Hong, Dexiang and Gao, Tingwei and Wang, Yitong and Shao, Jie and Wu, Xinglong and Wu, Zuxuan and Jiang, Yu-Gang},
  journal={arXiv preprint arXiv:2412.03859},
  year={2024}
}
