Generative World Renderer

Project Page YouTube X Daily Paper arXiv

This repo contains the code, models, and dataset used in

Generative World Renderer

Zheng-Hui Huang, Zhixiang Wang, Jiaming Tan, Ruihan Yu, Yidan Zhang, Bo Zheng, Yu-Lun Liu, Yung-Yu Chuang, Kaipeng Zhang

(Demo video: hero_small.mp4)

📢 Update

  • [2026.04.03] We have released our paper — discussions and feedback are warmly welcome!

🌐 Introduction

(Teaser figure)

TL;DR We present a large-scale dataset and framework for high-quality inverse and forward rendering of videos using fine-tuned video diffusion models. We extract synchronized RGB videos and five aligned G-buffer channels from two AAA games, and propose a VLM-based evaluation protocol for real-world scenes. Our pipeline consists of two components:

  • Inverse Renderer (RGB → G-buffers): Fine-tuned from Cosmos-Transfer1-DiffusionRenderer to decompose RGB videos into G-buffer maps (albedo, normal, depth, roughness, metallic)
  • Game Editing (G-buffers + Text → Stylized RGB): Fine-tuned from Wan2.1 1.3B (via DiffSynth-Studio) to synthesize photorealistic RGB videos from G-buffer inputs with controllable lighting and style via text prompts
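Conceptually, the two stages compose into a single RGB-to-stylized-RGB pipeline. The sketch below illustrates only the data flow; the function names and types are hypothetical stand-ins, not the repository's actual API.

```python
# Hypothetical sketch of the two-stage pipeline; names and types are
# illustrative only and are NOT the repository's actual API.
from typing import Dict, List

Frame = List[List[float]]           # a single image as nested lists
GBuffers = Dict[str, List[Frame]]   # one video per G-buffer channel

def inverse_render(rgb_frames: List[Frame]) -> GBuffers:
    """Stage 1 (stub): decompose an RGB video into the five G-buffer maps."""
    channels = ("albedo", "normal", "depth", "roughness", "metallic")
    return {name: rgb_frames for name in channels}

def game_edit(gbuffers: GBuffers, prompt: str) -> List[Frame]:
    """Stage 2 (stub): synthesize a stylized RGB video from G-buffers + text."""
    # The real model conditions on all channels plus the text prompt.
    return gbuffers["albedo"]

video = [[[0.5, 0.5], [0.5, 0.5]]]  # one tiny 2x2 "frame"
stylized = game_edit(inverse_render(video), "snowy winter scene")
```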

Key features of our dataset:

  • 4M+ frames at 720p / 30 FPS with 6 synchronized channels (RGB + albedo, normal, depth, metallic, roughness)
  • 40 hours of gameplay from 2 AAA games (Cyberpunk 2077 & Black Myth: Wukong)
  • Long-duration sequences: average 8 min per clip, up to 53 min continuous recording
  • Diverse content: urban/outdoor/indoor scenes, varying weather (sunny, rainy, foggy, night, sunset), and realistic motion patterns
  • Motion blur variant: offline-generated via sub-frame interpolation and linear-domain temporal averaging
  • VLM-based evaluation: reference-free assessment of material predictions using vision-language models
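The motion-blur variant averages sub-frames in linear light rather than in the gamma-encoded domain, which matches how a physical shutter integrates radiance. A minimal sketch of that step is below; the dataset's exact interpolation method and transfer function are not specified here, so the plain 2.2 gamma and linear blend are assumptions.

```python
import numpy as np

def motion_blur_average(frames_encoded, gamma=2.2):
    """Average sub-frames in linear light, then re-encode.

    `frames_encoded`: sequence of gamma-encoded sub-frame images in [0, 1].
    A plain 2.2 gamma is an assumption; the dataset pipeline's actual
    transfer function may differ.
    """
    frames = np.asarray(frames_encoded, dtype=np.float64)
    linear = frames ** gamma                 # decode to linear light
    blurred_linear = linear.mean(axis=0)     # temporal average (box shutter)
    return blurred_linear ** (1.0 / gamma)   # re-encode

def subframe_lerp(f0, f1, t):
    """Sub-frame interpolation sketch: linear blend of two neighboring
    frames at fractional time t (the actual interpolation method used
    for the dataset is not stated in this README)."""
    return (1.0 - t) * np.asarray(f0) + t * np.asarray(f1)
```

Averaging in linear light brightens mixed regions relative to a naive average of encoded values, which is the visible difference between physically plausible and "gamma-incorrect" motion blur.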

🚀 Usage

This repository contains the Inverse Renderer and Game Editing models. Please follow the instructions below to set up the environment and run inference for each model. We recommend creating separate conda environments for the two models to avoid version conflicts.

git clone --recurse-submodules https://github.com/ShandaAI/AlayaRenderer.git
cd AlayaRenderer

Model Weights

| Model | Base Model | Link |
| --- | --- | --- |
| Inverse Renderer | Cosmos-Transfer1-DiffusionRenderer 7B | HuggingFace |
| Game Editing | Wan2.1 1.3B | HuggingFace |

Inverse Renderer

Our model is fine-tuned from Cosmos-Transfer1-DiffusionRenderer. Please follow the inverse_renderer/ instructions for environment setup and inference. Download the related weights and replace the checkpoint under inverse_renderer/checkpoints/Diffusion_Renderer_Inverse_Cosmos_7B with our fine-tuned checkpoint.

Game Editing

Installation

Please follow the DiffSynth-Studio instructions to set up the environment and download the related weights. Download our fine-tuned checkpoint from HuggingFace and place it under game_editing/models/train/Wan2.1-T2V-1.3B_gbuffer/.

Quick Example

cd game_editing

CUDA_VISIBLE_DEVICES=0 python \
    examples/wanvideo/model_inference/inference_gbuffer_caption.py \
    --checkpoint models/train/Wan2.1-T2V-1.3B_gbuffer/model.safetensors \
    --gpu 0 \
    --style snowy_winter \
    --prompt "the scene is set in a frozen, snow-covered environment under cold, pale winter light with falling snowflakes, creating a silent and ethereal winter wonderland atmosphere." \
    --gbuffer_dir test_dataset \
    --save_dir outputs/ \
    --num_frames 81 --height 480 --width 832

📋 TODO

  • Release dataset.
  • Release data curation toolkit.

❤️ Acknowledgements

This project builds upon excellent prior works, including Cosmos-Transfer1-DiffusionRenderer, Wan2.1, and DiffSynth-Studio.

📄 License

See LICENSE.

📝 Citation

If you find this project helpful, please consider citing:

@article{huang2026generativeworldrenderer,
    title={Generative World Renderer},
    author={Zheng-Hui Huang and Zhixiang Wang and Jiaming Tan and Ruihan Yu and Yidan Zhang and Bo Zheng and Yu-Lun Liu and Yung-Yu Chuang and Kaipeng Zhang},
    journal={arXiv preprint arXiv:2604.02329},
    year={2026}
}
