🌞 TGVFM
Our Temporal-Guided Visual Foundation Models (TGVFM) introduce a unified framework that fuses event-based temporal information with pretrained Visual Foundation Models through a novel temporal context fusion block.
- Create a conda virtual env and install the dependencies using the requirements.sh file:
bash requirements.sh
- Download our prepared zip file: TGVFM-checkpoints.zip
- Unzip it into the ./checkpoints folder in the root directory of this repo.
- The checkpoints folder structure should look like this:
checkpoints
├── Rein_checkpoints
│ ├── ViT-B
│ │ ├── ...
│ ├── ViT-S
│ │ ├── ...
│ ├── TGVFM-B_Seg.pth
│ ├── TGVFM-S_Seg.pth
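A quick way to confirm the archive was unpacked as expected is a small layout check. The sketch below is not part of the repository: `check_checkpoints` is a hypothetical helper, and the only names it relies on are the ones shown in the tree above. It searches for the two `.pth` files recursively, so it reports where they actually ended up.

```python
from pathlib import Path

# Names copied from the tree above; the helper itself is a hypothetical sketch.
EXPECTED_DIRS = ["Rein_checkpoints/ViT-B", "Rein_checkpoints/ViT-S"]
EXPECTED_WEIGHTS = ["TGVFM-B_Seg.pth", "TGVFM-S_Seg.pth"]


def check_checkpoints(root="./checkpoints"):
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(root)
    problems = []
    for rel in EXPECTED_DIRS:
        if not (root / rel).is_dir():
            problems.append(f"missing directory: {root / rel}")
    for name in EXPECTED_WEIGHTS:
        # Search recursively so the check works wherever the archive placed the file.
        hits = sorted(root.rglob(name)) if root.is_dir() else []
        if hits:
            print(f"found {name} at {hits[0]}")
        else:
            problems.append(f"missing weight file: {name}")
    return problems


if __name__ == "__main__":
    issues = check_checkpoints()
    print("\n".join(issues) if issues else "checkpoints layout looks OK")
```

If a weight file is reported at a different location than the path passed to `--init_from` in the testing commands, adjust that argument to match.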
- Download our prepared zip file: dsec_dataset.zip
- Unzip it in the root directory (./) of this repo.
- The DSEC dataset folder structure should look like this:
dsec_dataset
├── 62mask_gt_label_train_edges
│ ├── zurich_city_00_a
│ ├── ...
├── 62mask_test_edges
│ ├── zurich_city_13_a
│ ├── ...
├── train_semantic_segmentation
│ ├── zurich_city_00_a
│ ├── ...
├── test_semantic_labels
│ ├── zurich_city_13_a
│ ├── ...
- The 62mask_gt_label_train_edges and 62mask_test_edges folders contain grayscale frames reconstructed by our E2VID-B3.
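Before launching training, it can help to verify that the four dataset folders listed above are in place. The following is a minimal sketch, not code from this repo: `summarize_dsec` is a hypothetical helper that only assumes the top-level folder names from the tree and treats each sub-directory as one sequence.

```python
from pathlib import Path

# Top-level folder names copied from the tree above.
DSEC_SUBDIRS = [
    "62mask_gt_label_train_edges",
    "62mask_test_edges",
    "train_semantic_segmentation",
    "test_semantic_labels",
]


def summarize_dsec(root="./dsec_dataset"):
    """Map each expected folder to its sorted sequence names, or None if absent."""
    root = Path(root)
    summary = {}
    for name in DSEC_SUBDIRS:
        folder = root / name
        if folder.is_dir():
            summary[name] = sorted(p.name for p in folder.iterdir() if p.is_dir())
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    for name, seqs in summarize_dsec().items():
        status = "MISSING" if seqs is None else f"{len(seqs)} sequences"
        print(f"{name}: {status}")
```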
Training (TGVFM-S):
python main.py --config-file configs/rein_distill.py --num-gpus 1 --bs 2 --lr 5e-6 --temporal_block 4 --sequences_num 5 --memory_length 3 --student_vit_type S --seg_train_w_gt --tag TGVFM-S
- Change --student_vit_type S to --student_vit_type B to train TGVFM-B.
Testing with our provided checkpoints:
- TGVFM-S
python main.py --config-file configs/rein_distill.py --num-gpus 1 --bs 2 --lr 5e-6 --temporal_block 4 --sequences_num 5 --memory_length 3 --student_vit_type S --seg_train_w_gt --tag TGVFM-S_Eval --eval-only --init_from ./checkpoints/TGVFM-S_Seg.pth
- TGVFM-B
python main.py --config-file configs/rein_distill.py --num-gpus 1 --bs 2 --lr 5e-6 --temporal_block 4 --sequences_num 5 --memory_length 3 --student_vit_type B --seg_train_w_gt --tag TGVFM-B_Eval --eval-only --init_from ./checkpoints/TGVFM-B_Seg.pth
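The training and testing commands differ only in the ViT type, the tag, and the evaluation flags, so they can be generated from one place to avoid copy-paste mistakes. The sketch below is an illustration, not part of the repository: `build_command` is a hypothetical helper, it uses only the flags and values shown in the commands above, and it assumes the script is started from the repo root.

```python
import shlex
import subprocess
import sys


def build_command(vit_type="S", eval_only=False, init_from=None):
    """Assemble the main.py argument list using only the flags shown in this README."""
    if vit_type not in ("S", "B"):
        raise ValueError("vit_type must be 'S' or 'B'")
    tag = f"TGVFM-{vit_type}" + ("_Eval" if eval_only else "")
    cmd = [
        sys.executable, "main.py",
        "--config-file", "configs/rein_distill.py",
        "--num-gpus", "1",
        "--bs", "2",
        "--lr", "5e-6",
        "--temporal_block", "4",
        "--sequences_num", "5",
        "--memory_length", "3",
        "--student_vit_type", vit_type,
        "--seg_train_w_gt",
        "--tag", tag,
    ]
    if eval_only:
        # Default checkpoint path mirrors the testing commands above.
        default_ckpt = f"./checkpoints/TGVFM-{vit_type}_Seg.pth"
        cmd += ["--eval-only", "--init_from", init_from or default_ckpt]
    return cmd


if __name__ == "__main__":
    # Pass --run to actually launch; by default the commands are only printed.
    run = "--run" in sys.argv
    for vit in ("S", "B"):
        cmd = build_command(vit, eval_only=True)
        print(shlex.join(cmd))
        if run:
            subprocess.run(cmd, check=True)
```

Calling `build_command("B")` without `eval_only` yields the TGVFM-B training command, that is, the training command above with `--student_vit_type B` and the tag changed to `TGVFM-B`.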
- Currently, only the code for supervised training of TGVFM on semantic segmentation is available. The distillation version and the code for other tasks will be made public soon.