TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring
By Zhu Xu, Ting Lei, Zhimin Li, Guan Wang, Qingchao Chen, Yuxin Peng, Yang Liu*
Accepted by ICCV 2025
- Follow PLA/env.yaml to construct the virtual environment.
- For object detection results, we use the pre-trained object detector VinVL. You can follow the steps in our baseline PLA to generate them on your own, or directly use our pre-processed detection results from the download in the last step below.
- For the dataset, download it from Action Genome.
- Download the necessary weakly-supervised annotation files and pre-trained weights (stored in Google Drive: Link, or in Baidu Cloud: Link with password 1234). The final data structure should look like the tree below (a small sanity-check sketch follows it):
| -- data
|    | -- action-genome
|         | -- frames
|         | -- videos
|         | -- annotations
|         | -- AG_detection_results_refine
| -- refine
|    | -- output            # pre-trained relation-aware transformer weight
|         | -- checkpoint.pth
| -- PLA
|    | -- models            # pre-trained scene graph generation weight
|         | -- model.tar
| -- RAFT
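Before running evaluation or training, it can help to verify that this layout is in place. The following is a minimal sanity-check sketch; the root path and the set of checked entries are taken from the tree above, and the contents of `annotations` and `AG_detection_results_refine` are not inspected.

```python
# check_data_layout.py -- minimal sanity check for the directory layout shown above
from pathlib import Path

DATA_ROOT = Path("data")  # adjust if your data folder lives elsewhere

EXPECTED_DIRS = [
    "action-genome/frames",
    "action-genome/videos",
    "action-genome/annotations",
    "action-genome/AG_detection_results_refine",
]
EXPECTED_FILES = [
    ("refine/output/checkpoint.pth", "pre-trained relation-aware transformer weight"),
    ("PLA/models/model.tar", "pre-trained scene graph generation weight"),
]

missing = [str(DATA_ROOT / d) for d in EXPECTED_DIRS if not (DATA_ROOT / d).is_dir()]
missing += [f"{f} ({desc})" for f, desc in EXPECTED_FILES if not Path(f).is_file()]

if missing:
    print("Missing entries:")
    for entry in missing:
        print("  -", entry)
else:
    print("Data layout looks complete.")
```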
cd refine
python scripts/evaluate.py # evaluate the performance of object detection
| Model | AP@1 | AP@10 | AR@1 | AR@10 | Weight |
|---|---|---|---|---|---|
| PLA(baseline) | 11.4 | 11.6 | 33.3 | 37.6 | - |
| Ours | 23.0 | 25.2 | 28.8 | 43.8 | Google Drive: weight Baidu Cloud: weight password 1234 |
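For context, AR@K numbers of this kind are typically computed as top-K detection recall: a ground-truth box counts as recovered if one of the K highest-scoring predicted boxes overlaps it with IoU above a threshold. The sketch below only illustrates that recipe; the box format, IoU threshold, and per-image aggregation are assumptions and do not reproduce `scripts/evaluate.py` exactly.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-8)

def detection_recall_at_k(gt_boxes, pred_boxes, pred_scores, k=10, iou_thr=0.5):
    """Fraction of ground-truth boxes matched by one of the top-k scoring predictions."""
    order = np.argsort(pred_scores)[::-1][:k]
    top_k = [pred_boxes[i] for i in order]
    hits = sum(1 for gt in gt_boxes if any(iou(gt, pb) >= iou_thr for pb in top_k))
    return hits / max(len(gt_boxes), 1)
```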
cd PLA
python test.py --cfg configs/final.yml # for final scene graph generation performance evaluation
| Model | W/R@10 | W/R@20 | W/R@50 | N/R@10 | N/R@20 | N/R@50 | Weight |
|---|---|---|---|---|---|---|---|
| PLA(baseline) | 14.32 | 20.42 | 25.43 | 14.78 | 21.72 | 30.87 | - |
| Ours | 17.56 | 22.33 | 27.45 | 18.76 | 24.49 | 33.92 | Google Drive : weight Baidu Cloud: weight password 1234 |
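Here W/R@K and N/R@K follow the usual With-Constraint / No-Constraint Recall@K convention for scene graph generation. As a rough illustration of the metric (not the actual evaluation code in `test.py`), Recall@K measures how many ground-truth (subject, predicate, object) triplets appear among the K highest-scoring predicted triplets; the real protocol also requires the subject and object boxes to match ground truth by IoU, which is omitted in this sketch.

```python
def triplet_recall_at_k(gt_triplets, pred_triplets, k=20):
    """Recall@K over (subject, predicate, object) triplets for a single frame.

    gt_triplets:   list of (subj_label, predicate, obj_label) ground-truth tuples
    pred_triplets: list of ((subj_label, predicate, obj_label), score) predictions
    """
    ranked = sorted(pred_triplets, key=lambda t: t[1], reverse=True)[:k]
    top_k = {triplet for triplet, _ in ranked}
    if not gt_triplets:
        return 1.0
    return sum(1 for t in gt_triplets if t in top_k) / len(gt_triplets)
```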
We use RAFT to generate the optical flow for our data. You can either use our pre-processed optical flow (stored in Link) or generate it on your own by following these steps:
cd RAFT  # download the RAFT checkpoint accordingly
python process_optical_flow.py
python post_process.py
Then place the generated optical flow files for the train and test sets under the folder ~/data/action-genome/.
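The optical flow supplies temporal cues to the per-frame detections. Purely as an illustration (the flow file path, format, and usage below are assumptions, not the actual TRKT pipeline), this sketch shifts a detection box from frame t to frame t+1 by the mean RAFT flow inside the box:

```python
import numpy as np

def warp_box_with_flow(box, flow):
    """Shift a box (x1, y1, x2, y2) by the mean optical flow inside it.

    flow: H x W x 2 array of per-pixel (dx, dy) displacements, e.g. produced by RAFT.
    """
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    region = flow[y1:y2, x1:x2].reshape(-1, 2)   # flow vectors inside the box
    if region.size == 0:
        return box
    dx, dy = region.mean(axis=0)
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)

# hypothetical usage:
# flow = np.load("data/action-genome/flow_train/<video>/<frame>.npy")
# box_next = warp_box_with_flow(box, flow)
```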
cd refine
python scripts/train.py
cd PLA
python train.py --cfg configs/oneframe.yml # after this training stage finishes, select the best oneframe checkpoint and set it as the model_path parameter in oneframe.yml before running the next command
python train.py --cfg configs/final.yml # for video SGG model
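If you prefer not to edit the config by hand between the two training stages, here is a minimal sketch using PyYAML: the key name `model_path` comes from the comment above, while the script name and the assumption that it is a top-level key are hypothetical.

```python
# set_model_path.py -- hypothetical helper to point a config at the chosen checkpoint
import sys
import yaml  # pip install pyyaml

def set_model_path(cfg_file, ckpt_path):
    with open(cfg_file) as f:
        cfg = yaml.safe_load(f)
    cfg["model_path"] = ckpt_path          # assumes model_path is a top-level key in the config
    with open(cfg_file, "w") as f:
        yaml.safe_dump(cfg, f, sort_keys=False)

if __name__ == "__main__":
    # e.g. python set_model_path.py configs/oneframe.yml path/to/best_oneframe_ckpt.pth
    set_model_path(sys.argv[1], sys.argv[2])
```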
We build our project upon PLA and RAFT; thanks for their great work.
@inproceedings{xu2025graph,
  title={Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced In-domain Knowledge Transferring},
  author={Zhu Xu and Ting Lei and Zhimin Li and Guan Wang and Qingchao Chen and Yuxin Peng and Yang Liu},
  booktitle={ICCV},
  year={2025},
  organization={IEEE}
}
