This repo contains the supported code and configuration files for reproducing the image classification results of LIT.
Download the ImageNet 2012 dataset from here, and prepare the dataset based on this script. The file structure should look like:
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
└── val
    ├── class1
    │   ├── img4.jpeg
    │   ├── img5.jpeg
    │   └── ...
    ├── class2
    │   ├── img6.jpeg
    │   └── ...
    └── ...

We provide baseline LIT models pretrained on ImageNet 2012.

| Name | Params (M) | FLOPs (G) | Top-1 Acc. (%) | Model | Log |
|---|---|---|---|---|---|
| LIT-Ti | 19 | 3.6 | 81.1 | google drive/github | log |
| LIT-S | 27 | 4.1 | 81.5 | google drive/github | log |
| LIT-M | 48 | 8.6 | 83.0 | google drive/github | log |
| LIT-B | 86 | 15.0 | 83.4 | google drive/github | log |
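
Before training or evaluating any of these models, it can help to confirm that the prepared dataset matches the `imagenet/train` and `imagenet/val` layout described earlier. The following is a minimal sketch using only the Python standard library; it is not part of the LIT codebases. The function names `summarize_split` and `check_imagenet_layout` and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Extensions treated as images. This set is an assumption; adjust it to your data.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def summarize_split(split_dir: Path) -> dict:
    """Return {class_name: image_count} for one split directory (hypothetical helper)."""
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split directory: {split_dir}")
    counts = {}
    # Every sub-directory of the split is treated as one class folder.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def check_imagenet_layout(root: str) -> dict:
    """Check that root contains train/ and val/, each with non-empty class folders."""
    summary = {}
    for split in ("train", "val"):
        counts = summarize_split(Path(root) / split)
        if not counts:
            raise ValueError(f"no class folders found in {split}/")
        empty = [name for name, n in counts.items() if n == 0]
        if empty:
            raise ValueError(f"{split}/ has class folders without images: {empty}")
        summary[split] = counts
    return summary
```

Calling `check_imagenet_layout("imagenet")` returns the per-class image counts for both splits, and raises an error if a split directory is missing or a class folder contains no images.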
LIT-Ti uses a different training strategy from the other LIT models, so we provide two codebases.
For LIT-Ti, please refer to code_for_lit_ti.
For LIT-S, LIT-M, LIT-B, please refer to code_for_lit_s_m_b.
This repository is released under the Apache 2.0 license as found in the LICENSE file.
This repository adopts code from DeiT, PVT, and Swin. We thank the authors for their open-source code.