
TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

arXiv

Video

By Zhu Xu, Ting Lei, Zhimin Li, Guan Wang, Qingchao Chen, Yuxin Peng, Yang Liu*

Accepted by ICCV 2025

Installation

Follow PLA/env.yaml to construct the virtual environment.

Dataset

Data preparation

  1. For object detection results, we use the pre-trained object detector VinVL. You can follow the steps in our baseline PLA to generate them on your own, or directly use our pre-processed detection results from step 3 below.

  2. For the dataset, download it from Action Genome.

  3. Download the necessary weakly-supervised annotation files and pre-trained weights (stored in Google Drive: Link, or in Baidu Cloud: Link with password 1234). The final data structure should look like:

| -- data
|      | -- action-genome
|            | -- frames
|            | -- videos
|            | -- annotations
|            | -- AG_detection_results_refine
| -- refine
|      | -- output  # pre-trained relation-aware transformer weight
|            | -- checkpoint.pth
| -- PLA
|      | -- models  # pre-trained scene graph generation weight
|            | -- model.tar
| -- RAFT

Evaluation

Object Detection Performance of the Relation-Aware Transformer (TRKT) Model

cd refine
python scripts/evaluate.py # evaluate the performance of object detection
| Model | AP@1 | AP@10 | AR@1 | AR@10 | Weight |
| --- | --- | --- | --- | --- | --- |
| PLA (baseline) | 11.4 | 11.6 | 33.3 | 37.6 | - |
| Ours | 23.0 | 25.2 | 28.8 | 43.8 | Google Drive: weight / Baidu Cloud: weight (password 1234) |
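For context, detection AP/AR at K are computed by matching the top-K predicted boxes to ground-truth boxes by IoU overlap. A minimal IoU sketch is shown below; this is only illustrative of the standard metric and is not the repo's scripts/evaluate.py:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    # Intersection rectangle (clamped to zero when boxes don't overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of areas minus intersection
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A prediction is typically counted as correct when its IoU with a ground-truth box exceeds a threshold (e.g. 0.5).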

Scene Graph Generation Performance of the DSGG (PLA) Model

cd PLA
python test.py --cfg configs/final.yml # for final scene graph generation performance evaluation
| Model | W/R@10 | W/R@20 | W/R@50 | N/R@10 | N/R@20 | N/R@50 | Weight |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PLA (baseline) | 14.32 | 20.42 | 25.43 | 14.78 | 21.72 | 30.87 | - |
| Ours | 17.56 | 22.33 | 27.45 | 18.76 | 24.49 | 33.92 | Google Drive: weight / Baidu Cloud: weight (password 1234) |
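The R@K numbers measure the fraction of ground-truth relation triplets that appear among the K highest-scored predicted triplets. A minimal sketch under an assumed (subject, predicate, object) triplet format follows; it is illustrative only, not the repo's test.py:

```python
def recall_at_k(pred_triplets, gt_triplets, k):
    """Recall@K for relation triplets.

    pred_triplets: list of ((subject, predicate, object), score) pairs
    gt_triplets:   iterable of ground-truth (subject, predicate, object) tuples
    """
    # Keep the K predictions with the highest scores
    top_k = sorted(pred_triplets, key=lambda t: t[1], reverse=True)[:k]
    # Count ground-truth triplets recovered in the top-K set
    hits = {t for t, _ in top_k} & set(gt_triplets)
    return len(hits) / len(gt_triplets) if gt_triplets else 0.0
```

In practice this is computed per video/frame and averaged, with "with-constraint" and "no-constraint" variants differing in how many predicates each subject-object pair may contribute.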

Training

Step1. Optical Flow Extraction

We use RAFT to generate the optical flow for our data. You can either use our pre-processed optical flow (stored in Link) or generate it on your own with the following steps:

cd RAFT   ## download the RAFT ckpt accordingly
python process_optical_flow.py
python post_process.py

Then place the generated optical flow files for the train and test sets under the folder ~/data/action-genome/.

Step2. Relation-aware Refine Model Training

cd refine
python scripts/train.py 

Step3. Scene Graph Generation Model Training

cd PLA
python train.py --cfg configs/oneframe.yml # after this training run, select the best oneframe checkpoint as the model_path parameter in oneframe.yml before running the next line
python train.py --cfg configs/final.yml # for video SGG model
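The hand-off between the two stages is done through the model_path parameter; the config edit might look like the excerpt below (the surrounding key layout of the yml file and the checkpoint filename are assumptions, shown only to illustrate the step):

```yaml
# configs/oneframe.yml (excerpt; checkpoint path is illustrative)
model_path: output/oneframe/best_checkpoint.tar
```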

Acknowledgement

We build our project upon PLA and RAFT; thanks for their great work.

Citation

@inproceedings{xu2025graph,
      title={Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced In-domain Knowledge Transferring}, 
      author={Zhu Xu and Ting Lei and Zhimin Li and Guan Wang and Qingchao Chen and Yuxin Peng and Yang Liu},
      year={2025},
      booktitle={ICCV},
      organization={IEEE}
}
