🌌 Object-Centric Representation Learning for 3D Semantic Scene Graph Prediction (NeurIPS 2025)

We propose a two-stage training framework for 3D Semantic Scene Graph Prediction. Our method focuses on learning discriminative object representations to improve predicate reasoning.
If you find our insights useful, please ⭐ the repository or recommend it to your friends. Thanks!

Authors: KunHo Heo*, GiHyeon Kim*, SuYeon Kim, MyeongAh Cho
*Equal Contribution

[Project Page] [Paper] [Arxiv]

💡 Core contribution of our work: object representation is the key bottleneck in predicate estimation.

🚀 Environment Setup

Step 1) Conda environment setup

Run the following commands on your cloud or local server.

conda create -n vlsat python=3.8
conda activate vlsat
pip install -r requirement.txt
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.12.1+cu113.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.12.1+cu113.html
pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.12.1+cu113.html
pip install torch-geometric==2.2.0
pip install git+https://github.com/openai/CLIP.git
pip install hydra
pip install hydra-core --upgrade --pre
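
After installation, it can help to confirm that the pinned versions above actually landed in the environment. The following is a minimal sketch that uses only the standard library, so it runs even when an install step failed. The helper name `check_environment` and the `EXPECTED` table are illustrative and not part of this repository; the package names and versions are copied from the pip commands above.

```python
from importlib import metadata

# Versions pinned by the pip commands above. Local build tags such as
# "+cu113" are ignored when comparing.
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "torchaudio": "0.12.1",
    "torch-geometric": "2.2.0",
}


def check_environment(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Strip the local version label (e.g. "1.12.1+cu113" -> "1.12.1").
        base = installed.split("+")[0] if installed else None
        report[name] = (installed, base == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:16s} installed={installed} [{status}]")
```

A mismatch here does not necessarily mean the code will fail, but it is a reasonable first thing to check if the torch-scatter or torch-sparse wheels refuse to load.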

Step 2) Download 3DSSG Dataset

1. Download 3RScan

First, download the 3RScan dataset. You can follow the instructions provided in the 3DSSG official guide.

2. Generate 2D Multi-view Images

Convert the point clouds into 2D images from multiple viewpoints. Make sure to update the internal path in the script to match your local environment.

# Modify the path in pointcloud2image.py to match your local environment.
python data/pointcloud2image.py

3. Directory Structure

Make sure your folders are organized as follows for proper operation:

data
  3DSSG_subset
    relations.txt
    classes.txt

  3RScan
    <scan_id_1>
      multi_view/
      labels.instances.align.annotated.v2.ply
    <scan_id_2>
    ...

Training & Evaluation

We propose a two-stage training method: Object Feature Learning and Scene Graph Prediction.

[Object Encoder Ckpt] [Scene Generation Model Ckpt]

Stage 1) Object Feature Learning

See here for the Object Feature Learning (OFL) documentation.

Stage 2) Scene Graph Prediction

Train and evaluate scene graph prediction with the following commands.

# Train
python -m main --mode train --config <config_path> --exp <exp_name>

# Evaluate
python -m main --mode eval --config <config_path> --exp <exp_name>
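
If you want to chain the two commands above, a small wrapper can run evaluation only when training exits cleanly. This is a sketch, not part of the repository: `build_command` and `train_then_eval` are hypothetical helpers, and the sketch assumes that evaluation with the same `--config` and `--exp` values picks up the output of the training run.

```python
import subprocess
import sys


def build_command(mode, config_path, exp_name):
    """Build the argv list for one of the two commands shown above."""
    if mode not in ("train", "eval"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return [sys.executable, "-m", "main",
            "--mode", mode, "--config", config_path, "--exp", exp_name]


def train_then_eval(config_path, exp_name):
    """Run training, then evaluation; stop if training fails."""
    for mode in ("train", "eval"):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so evaluation is skipped when training fails.
        subprocess.run(build_command(mode, config_path, exp_name), check=True)
```

Call `train_then_eval("<config_path>", "<exp_name>")` from the repository root, substituting your own config path and experiment name.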

📚 Citation

If you find our work useful, please cite:

@article{heo2025object,
  title={Object-Centric Representation Learning for Enhanced 3D Scene Graph Prediction},
  author={Heo, KunHo and Kim, GiHyun and Kim, SuYeon and Cho, MyeongAh},
  journal={arXiv preprint arXiv:2510.04714},
  year={2025}
}

✨ Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00456589) and the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2025-02263277 and RS-2022-00155911, Artificial Intelligence Convergence Innovation Human Resources Development (Kyung Hee University)).
Also, this project is inspired by and partially based on the following repositories:
