Monsterl-ite

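A lightweight MonSter student model with training and evaluation scripts. It supports PatchRefinerV2 integration, student parameter sharing, and teacher motion distillation.

Based on MonSter (CVPR 2025): Marry Monodepth to Stereo Unleashes Power.

Documentation and command quick reference

COMMANDS.md: complete descriptions and quick-reference tables covering training, evaluation, and pipeline commands; Hydra-overridable parameters; PatchRefinerV2 integration and main-branch selection; and the new student-lightweighting and distillation parameters (parameter sharing, teacher motion input).

Training config files (`config/`)

All Hydra default configs live in config/ under the project root. Select one at launch with --config-name <file name without .yaml>: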

| Config file | Entry script | Description |
| --- | --- | --- |
| config/train_sceneflow.yaml | `train_sceneflow.py` | Main SceneFlow training |
| config/train_kitti.yaml | `train_kitti.py` | KITTI fine-tuning |
| config/train_eth3d.yaml | `train_eth3d.py` | ETH3D |
| config/train_middlebury.yaml | `train_middlebury.py` | Middlebury (MiddEval3, 2021, etc., controlled by fields such as `middlebury_split`) |

Any key can be overridden on the command line, for example: train_middlebury.py middlebury_root=/path/to/data save_path=/path/to/out.
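
As a rough illustration of how `key=value` overrides layer on top of the YAML defaults, here is a minimal plain-Python sketch. It is not Hydra's parser: the helper names `parse_value` and `apply_overrides`, the default values, and the simplified type handling are assumptions made for illustration only.

```python
from copy import deepcopy


def parse_value(text):
    """Best-effort scalar parsing; Hydra's real override grammar is richer."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` with dotted `key=value` overrides applied."""
    cfg = deepcopy(defaults)
    for item in overrides:
        key, _, raw = item.partition("=")
        node = cfg
        *parents, leaf = key.split(".")
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw)
    return cfg


# Hypothetical defaults standing in for config/train_middlebury.yaml.
defaults = {"middlebury_root": "", "save_path": "./checkpoints", "total_step": 100000}
cfg = apply_overrides(
    defaults,
    ["middlebury_root=/path/to/data", "save_path=/path/to/out", "total_step=5000"],
)
print(cfg)
```

Keys that are not overridden keep their YAML defaults, and the defaults themselves are left untouched.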

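Recent updates (Monsterl-ite)

Student lightweighting (student only, optional)
- BasicMotionEncoder_mix2 weight reuse: with student_shared_motion_encoder: true, a single set of convd/convc/conv weights is reused for both the stereo and mono paths.
- Shared REMP conv1/conv2: with student_shared_remp_convs: true, a single set of REMP front-end convs is reused for both paths.
  Both default to false, which preserves the original behavior; see the summary of newly added parameters in COMMANDS.md.
Teacher motion passed into the student update block (distillation enhancement)
- With distill_teacher_motion: true, the teacher runs its forward pass first during training and returns the motion features of every iteration; the student update block fuses the teacher motion into its GRU input before the GRU runs. At inference no teacher motion is passed, and the student runs forward on its own.
  Optional; defaults to false. See COMMANDS.md.
PatchRefinerV2 integration and main-branch selection
- Refiner/Coarse use MonsterLiteRefiner / MonsterLiteCoarse and support use_student and student_*.
- Main-branch selection: coarse_use_stereo: true makes stereo the main branch; leaving it unset or false makes mono the main branch.
  See COMMANDS.md, section 0 (summary and quick reference of newly added parameters).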
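
To make the effect of the two sharing switches concrete, the sketch below counts convolution parameters for a two-path block with and without weight sharing. The layer shapes are hypothetical placeholders, not the real BasicMotionEncoder_mix2 or REMP shapes; only the separate-versus-shared comparison is the point.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Parameter count of one 2D convolution layer."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def two_path_params(layers, shared):
    """Total parameters when the stereo and mono paths each need `layers`.

    With shared=True one set of weights serves both paths, so it is
    counted once; otherwise each path owns its own copy.
    """
    per_path = sum(conv2d_params(*spec) for spec in layers)
    return per_path if shared else 2 * per_path


# Hypothetical (in_channels, out_channels, kernel_size) triples.
layers = [(2, 64, 7), (64, 64, 3), (128, 126, 3)]

separate = two_path_params(layers, shared=False)
shared = two_path_params(layers, shared=True)
print(f"separate: {separate}, shared: {shared}, saved: {separate - shared}")
```

Sharing halves the parameter count of these layers but not their compute, since both paths still pass through them.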

Installation and environment

NVIDIA GPU (RTX 3090 or equivalent recommended)
Python 3.8+

conda create -n monster python=3.8
conda activate monster
pip install torch torchvision torchaudio  # match your CUDA version
pip install tqdm scipy opencv-python scikit-image tensorboard matplotlib
pip install timm mmcv accelerate hydra-core omegaconf
# install xformers separately if you use it

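No versions are pinned: any set of dependency versions under which train_sceneflow.py runs correctly is acceptable. For the complete list, see the original MonSter repository.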

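Data and weights

Training: SceneFlow (FlyingThings3D / Monkaa / Driving); paths are set via sceneflow_root and driving_root in config/train_sceneflow.yaml.
Teacher weights: distillation requires teacher_ckpt to point to a pretrained MonSter checkpoint (e.g. sceneflow.pth), available from the official MonSter release or Hugging Face.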

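Training (brief)

This repository uses single-stage training: one run of train_sceneflow.py trains the student model end to end (stereo loss + optional distillation + optional depth loss/EPE, etc.). There is no need for multiple stages or for training stereo first and depth separately.

Edit the config: in config/train_sceneflow.yaml, set at least sceneflow_root, save_path, and teacher_ckpt (if distilling).
Student + distillation: use_student: True and distill_enable: True are already set by default; student_shared_motion_encoder, student_shared_remp_convs, and distill_teacher_motion are optional (see above).

Launch: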

CUDA_VISIBLE_DEVICES=0,1 accelerate launch --num_processes=2 --mixed_precision=bf16 train_sceneflow.py

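Override parameters: train_sceneflow.py total_step=5000 save_path=./out sceneflow_root=/path/to/sceneflow
Pipeline: bash run_full_pipeline.sh runs, in order, the training above once, then student evaluation, then teacher evaluation (one training run plus two evaluations, not multi-stage training).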
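
The launch, override, and pipeline steps can also be assembled from Python, which is handy for dry runs. This sketch only rebuilds the commands shown in this README; the helper names are hypothetical, the location of final.pth under save_path is an assumption, and the actual contents of run_full_pipeline.sh may differ.

```python
import shlex


def train_command(num_processes=2, mixed_precision="bf16", overrides=()):
    """Build the accelerate launch command for train_sceneflow.py."""
    return [
        "accelerate", "launch",
        f"--num_processes={num_processes}",
        f"--mixed_precision={mixed_precision}",
        "train_sceneflow.py",
        *overrides,
    ]


def eval_command(ckpt, use_student=False, sceneflow_root=None):
    """Build a plain-python evaluate_stereo.py command (no accelerate)."""
    cmd = ["python", "evaluate_stereo.py",
           "--restore_ckpt", ckpt, "--dataset", "sceneflow"]
    if use_student:
        cmd.append("--use_student")
    if sceneflow_root:
        cmd += ["--sceneflow_root", sceneflow_root]
    return cmd


# One training run followed by student and teacher evaluation.
pipeline = [
    train_command(overrides=["total_step=5000", "save_path=./out"]),
    eval_command("./out/final.pth", use_student=True,
                 sceneflow_root="/path/to/sceneflow/things"),
    eval_command("/path/to/teacher.pth"),
]

for cmd in pipeline:
    print(shlex.join(cmd))  # dry run: print instead of executing
```

To execute rather than print, pass each list to subprocess.run with CUDA_VISIBLE_DEVICES set in the environment.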

See COMMANDS.md for more commands and parameters.

Evaluation

# Student model (use the same student config as in training)
python evaluate_stereo.py --restore_ckpt ./save_path/final.pth --dataset sceneflow --use_student --sceneflow_root /path/to/sceneflow/things

# Teacher model
python evaluate_stereo.py --restore_ckpt /path/to/teacher.pth --dataset sceneflow

Do not run evaluation with accelerate launch. For older student checkpoints without multi-scale REMP, add --no_use_remp_multiscale. See COMMANDS.md.

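Lightweight student (MonsterStudent): key structure points

core/monster_student.py: MonsterStudent shrinks the encoder and GRU through student_encoder, student_hidden_dims, student_n_gru_layers, etc.; it keeps the teacher's architecture with roughly 1/10 of the parameters.
core/cost_agg_rts.py: optional RTSPost3DAggregator (student_cost_agg: rts_post3d).
Distillation: output distillation (L1 on the init and per-iteration sequence disparities) + optional teacher motion injection into the update block (distill_teacher_motion); see config/train_sceneflow.yaml and COMMANDS.md for the configuration.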
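
The output-distillation term can be sketched as an L1 distance between student and teacher disparities, taken on the init prediction and on each iteration of the sequence. This is a plain-Python sketch on flat lists; the function names, the uniform averaging over iterations, the equal iteration counts, and the `init_weight` parameter are assumptions, not the repository's actual loss.

```python
def l1(pred, target):
    """Mean absolute difference between two equally sized disparity lists."""
    if len(pred) != len(target):
        raise ValueError("disparity maps must have the same size")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def output_distill_loss(student_init, teacher_init, student_seq, teacher_seq,
                        init_weight=1.0):
    """L1 on the init disparity plus the mean L1 over per-iteration disparities."""
    if len(student_seq) != len(teacher_seq):
        raise ValueError("student and teacher must run the same number of iterations")
    init_term = l1(student_init, teacher_init)
    seq_term = sum(l1(s, t) for s, t in zip(student_seq, teacher_seq)) / len(student_seq)
    return init_weight * init_term + seq_term


# Toy 4-pixel disparity maps with two refinement iterations.
teacher_init = [10.0, 12.0, 8.0, 6.0]
student_init = [9.0, 12.0, 9.0, 6.0]
teacher_seq = [[10.0, 12.0, 8.0, 6.0], [10.5, 12.5, 8.5, 6.5]]
student_seq = [[10.0, 11.0, 8.0, 6.0], [10.5, 12.5, 8.5, 6.5]]

loss = output_distill_loss(student_init, teacher_init, student_seq, teacher_seq)
print(loss)
```

In a real training loop the teacher outputs would be computed without gradients and this term would be added to the stereo loss.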

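Using with PatchRefinerV2

Refiner: refiner.fine_branch.type: MonsterLiteRefiner, with monster_lite_root pointing to this repository.
Coarse: coarse_branch.type: MonsterLiteCoarse; it likewise supports use_student and student_*.
Main branch: coarse_use_stereo: true makes stereo the main branch; otherwise mono is the main branch.

See COMMANDS.md, section 0 (summary of newly added parameters).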
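
A hypothetical PatchRefinerV2-side config, written as a Python dict, shows how the keys above might fit together. The nesting, the placement of monster_lite_root and coarse_use_stereo, and the path value are assumptions; COMMANDS.md holds the authoritative layout.

```python
def main_branch(cfg):
    """Stereo is the main branch only when coarse_use_stereo is truthy.

    A missing key behaves like false, i.e. mono is the main branch.
    """
    return "stereo" if cfg.get("coarse_use_stereo", False) else "mono"


# Hypothetical layout; only the key names come from this README.
cfg = {
    "refiner": {
        "fine_branch": {
            "type": "MonsterLiteRefiner",
            "monster_lite_root": "/path/to/Monsterl-ite",
            "use_student": True,
        },
    },
    "coarse_branch": {
        "type": "MonsterLiteCoarse",
        "use_student": True,
    },
    "coarse_use_stereo": True,
}

print(main_branch(cfg))
print(main_branch({}))
```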

Citation

If you use MonSter or this repository, please cite:

@InProceedings{Cheng_2025_CVPR,
    author    = {Cheng, Junda and Liu, Longliang and Xu, Gangwei and Wang, Xianqi and Zhang, Zhaoxing and Deng, Yong and Zang, Jinliang and Chen, Yurui and Cai, Zhipeng and Yang, Xin},
    title     = {MonSter: Marry Monodepth to Stereo Unleashes Power},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {6273-6282}
}

Acknowledgements

This repository builds on MonSter, RAFT-Stereo, and related work.
