# Less is More: Pay Less Attention in Vision Transformers

Training and evaluation code for LIT-Ti.

## Training

First, activate your conda virtual environment.

```bash
conda activate lit
```
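
If the `lit` environment does not exist yet, you can create one first. This is a minimal sketch; the Python version and package list are assumptions, so check the repository's requirements:

```bash
# Create and activate a fresh environment (Python version is an assumption)
conda create -n lit python=3.8 -y
conda activate lit

# Install the core dependencies (exact packages/versions are assumptions)
pip install torch torchvision
```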

Make sure `data_path` in `config/lit-ti.json` points to your ImageNet directory.

To train LIT-Ti, run:

```bash
bash scripts/train_lit.sh [GPUs]
```
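
For example, assuming `[GPUs]` is the number of GPUs (as in the evaluation example below), training on 8 GPUs would be:

```bash
# Launch training on 8 GPUs (the argument meaning is inferred from the eval example)
bash scripts/train_lit.sh 8
```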

Note: We use a total batch size of 1024 for all experiments on ImageNet. If that does not suit your hardware, you can change the per-GPU `batch_size` in `config/lit-ti.json`. For example, setting `batch_size` to 64 and training with 8 GPUs gives a total batch size of 512.
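
For reference, the relevant fields of `config/lit-ti.json` might look like the sketch below; only `data_path` and `batch_size` are mentioned in this README, so the surrounding structure and values are assumptions:

```json
{
  "data_path": "/path/to/imagenet",
  "batch_size": 128
}
```

With a per-GPU `batch_size` of 128, training on 8 GPUs reproduces the default total batch size of 1024.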

## Evaluation

To evaluate LIT-Ti on ImageNet, run:

```bash
bash scripts/eval_lit.sh [GPUs] [Checkpoint]
```

For example, to evaluate LIT-Ti with one GPU, you can run:

```bash
bash scripts/eval_lit.sh 1 checkpoint/lit_ti.pth
```

This should give:

```
* Acc@1 81.124 Acc@5 95.544 loss 0.901
Accuracy of the network on the 50000 test images: 81.1%
```

Results may differ slightly depending on your environment.

## Results

| Name | Params (M) | FLOPs (G) | Top-1 Acc. (%) | Model | Log |
| --- | --- | --- | --- | --- | --- |
| LIT-Ti | 19 | 3.6 | 81.1 | google drive | github log |