vit-pytorch
(Work in progress) PyTorch implementation of the Vision Transformer (ViT), based on the ICLR 2021 paper by Dosovitskiy et al.
TODOs
- Add patch embeddings
- Add transformer encoder layer
- Add transformer encoder (multiple layers)
- Understand why the class token is repeated across the batch instead of being defined as a parameter with shape (bs, 1, embed_dim)
- Attention dropout
- Embedding dropout
- MLP dropout (in encoder)
- Add classification head
- Complete ViT-Base
- Name the layers so the model is torchvision compatible
- Add training scripts
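A note on the class token item above: a likely reason for repeating the token is that a parameter of shape (bs, 1, embed_dim) would tie the model to one batch size and learn a separate token per batch position, while a single (1, 1, embed_dim) parameter is shared by every sample and broadcast at run time. The sketch below illustrates this; `ClassTokenPrepender` is a hypothetical helper written for this note, not a class from this repository.

```python
import torch
import torch.nn as nn


class ClassTokenPrepender(nn.Module):
    """Hypothetical helper: prepends one learned class token to each sequence."""

    def __init__(self, embed_dim: int):
        super().__init__()
        # Shape (1, 1, embed_dim): a single token shared by every sample.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (bs, num_patches, embed_dim)
        bs = patches.shape[0]
        # expand() returns a broadcast view over the batch dimension,
        # so no per-sample parameters are created.
        cls = self.cls_token.expand(bs, -1, -1)
        return torch.cat([cls, patches], dim=1)


layer = ClassTokenPrepender(embed_dim=768)
out_small = layer(torch.randn(2, 196, 768))  # batch of 2
out_large = layer(torch.randn(5, 196, 768))  # batch of 5, same parameter
num_params = sum(p.numel() for p in layer.parameters())
```

The same module handles both batch sizes, and the parameter count stays at embed_dim regardless of how many samples are passed in.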
Install
pip install vit_pytorch
How to use
Load a config.yml file and pass it to the ViT module to modify architecture parameters.
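As a rough sketch of that workflow, the snippet below reads architecture parameters into a small config object. Everything named here is an assumption for illustration: the `ViTConfig` class, its field names, and the `load_config` helper are hypothetical, the defaults follow the ViT-Base/16 settings from the paper, and the file is assumed to be parsed with PyYAML. Check the repository for the keys and the ViT constructor signature it actually expects.

```python
from dataclasses import dataclass, fields


@dataclass
class ViTConfig:
    # Hypothetical field names; defaults follow ViT-Base/16 from the paper.
    image_size: int = 224
    patch_size: int = 16
    embed_dim: int = 768
    depth: int = 12
    num_heads: int = 12
    mlp_dim: int = 3072
    num_classes: int = 1000

    @classmethod
    def from_dict(cls, raw: dict) -> "ViTConfig":
        # Reject misspelled keys instead of silently ignoring them.
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"Unknown config keys: {sorted(unknown)}")
        return cls(**raw)


def load_config(path: str) -> ViTConfig:
    # Assumes PyYAML is installed; imported lazily so the rest works without it.
    import yaml

    with open(path) as f:
        raw = yaml.safe_load(f) or {}
    return ViTConfig.from_dict(raw)


# Override a few values, as a config.yml with these three keys would.
cfg = ViTConfig.from_dict({"depth": 6, "num_heads": 8, "embed_dim": 512})
num_patches = (cfg.image_size // cfg.patch_size) ** 2

# Hypothetical hand-off; the real constructor signature may differ:
# model = ViT(cfg)
```

Keys left out of the file fall back to the ViT-Base defaults, so a config.yml only needs to list the parameters being changed.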