Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing

arXiv Project Page HuggingFace License: MIT

Jiyuan Wang1,2,3   Chunyu Lin1,✉   Lei Sun2,✝   Zhi Cao1   Yuyang Yin1   Lang Nie4   Zhenlong Yuan2   Xiangxiang Chu2   Yunchao Wei1   Kang Liao3   Guosheng Lin3,✉

1BJTU    2AMap, Alibaba Group    3NTU    4CQUPT   
✉ Corresponding author   ✝ Project leader


We propose RL3DEdit, a novel RL-based single-pass framework for 3D scene editing. Our core insight is that while generating multi-view consistent 3D content is highly challenging, verifying 3D consistency is tractable — naturally positioning reinforcement learning as a feasible solution. We leverage the 3D foundation model VGGT as a geometry-aware reward model and employ GRPO to effectively anchor the 2D editor's prior onto the 3D consistency manifold.
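The verification insight above can be illustrated with a toy geometry-consistency reward. The sketch below is purely hypothetical and is not the paper's implementation: `consistency_reward` stands in for the VGGT-based reward model, and scores a set of edited views by how well their per-view 3D point predictions agree with the group consensus.

```python
import numpy as np

def consistency_reward(view_points: np.ndarray) -> float:
    """Toy multi-view consistency reward.

    `view_points` has shape (num_views, num_points, 3): per-view 3D point
    predictions for the same scene points, as a geometry foundation model
    such as VGGT might produce. Views that agree geometrically score
    higher. This is a hypothetical stand-in, not the actual reward model.
    """
    consensus = view_points.mean(axis=0)                       # consensus geometry
    err = np.linalg.norm(view_points - consensus, axis=-1)     # per-point deviation
    return float(np.exp(-err.mean()))                          # 1.0 = perfectly consistent

# Identical views are perfectly consistent:
identical = np.tile(np.random.rand(1, 100, 3), (9, 1, 1))
print(consistency_reward(identical))  # 1.0
```

The key property, mirroring the paper's argument, is that this check is cheap to compute even though *generating* views that satisfy it is hard, which is exactly the asymmetry that makes an RL reward practical here.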

📢 News

  • [2026-03-11]: Code and model weights are coming soon. Stay tuned! 🚀
  • [2026-03-04]: Paper released on arXiv.

💡 Highlights

  • 🏆 State-of-the-Art Performance: RL3DEdit achieves a VIEScore of 5.48 (vs. 3.23 for the strongest baseline), demonstrating superior editing fidelity and semantic alignment.
  • ⚡ High Efficiency: Single-pass inference in just 1.5 minutes, substantially faster than traditional pipelines and over 20× faster than other FLUX-based baselines.
  • 🧠 Novel RL Paradigm: First work to introduce reinforcement learning into 3D scene editing, using VGGT as a geometry-aware reward model.

🛠️ Setup

Code is coming soon. This section will be updated once the code is released.

  1. Clone the repository:
git clone https://github.com/AMAP-ML/RL3DEdit.git
cd RL3DEdit
  2. Install dependencies:
conda create -n rl3dedit python=3.10 -y
conda activate rl3dedit
pip install -r requirements.txt  # Coming soon

🔥 Training

  1. Prepare Training Data:

    We collect 8 scenes from IN2N, BlendedMVS, and Mip-NeRF360 datasets, and construct 7–9 editing prompts per scene using a VLM, yielding 70 prompts in total.

  2. Run Training Script:

# Training script will be released soon
accelerate launch train_grpo.py \
    --config configs/rl3dedit.yaml \
    --lora_rank 32 \
    --num_views 9 \
    --group_size 16 \
    --sde_noise 0.8

Training was conducted for one epoch on an NVIDIA RTX Pro 6000 GPU and took ~42 hours.
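As a point of reference for the `--group_size` flag above, GRPO scores each rollout against its own group rather than a learned value function: advantages are the group-normalized rewards. The sketch below shows only that generic normalization step, not the project's training code.

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages as in GRPO.

    Each sample in a group of rollouts (e.g. group_size=16 edited-view
    batches) is normalized by the group's own mean and std, so no critic
    is needed. Generic sketch, not the repository's implementation.
    """
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# One group of consistency rewards -> zero-mean, unit-scale advantages
group_rewards = np.array([0.2, 0.5, 0.9, 0.4])
print(grpo_advantages(group_rewards).round(3))
```

Rollouts with above-average consistency rewards get positive advantages and are reinforced; below-average ones are suppressed, which is how the 2D editor's prior gets anchored onto the 3D consistency manifold.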

🕹️ Inference

Editing a 3D Scene

# Inference script will be released soon
python inference.py \
    --scene_path /path/to/your/scene \
    --instruction "your editing instruction" \
    --output_path /path/to/output

Evaluation on Test Set

# Evaluation script will be released soon
python evaluate.py --config configs/eval.yaml

Our test data includes 100 cases: novel views (70), unseen instructions (16), and new scenes (14).

🤗 Model Zoo

| Model    | Backbone         | Training Data            | Download    |
|----------|------------------|--------------------------|-------------|
| RL3DEdit | FLUX-Kontext-dev | 70 prompts, 1319 samples | Coming Soon |

🎓 Citation

If you find our work useful in your research, please consider citing our paper:

@article{wang2026geometry,
  title={Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing},
  author={Wang, Jiyuan and Lin, Chunyu and Sun, Lei and Cao, Zhi and Yin, Yuyang and Nie, Lang and Yuan, Zhenlong and Chu, Xiangxiang and Wei, Yunchao and Liao, Kang and others},
  journal={arXiv preprint arXiv:2603.03143},
  year={2026}
}

🙏 Acknowledgements

We thank the authors of FLUX-Kontext, VGGT, GRPO, and Flow-GRPO for their excellent work.


⭐ If you find this project useful, please give it a star! ⭐
