vit-pytorch

(work in progress) PyTorch implementation of the Vision Transformer (ViT), based on the ICLR 2021 paper by Dosovitskiy et al.

TODOs

  • Add patch embeddings
  • Add transformer encoder layer
  • Add transformer encoder (multiple layers)
  • Understand why the class token is repeated across the batch instead of being defined as a parameter with shape (bs, 1, embed_dim)
  • Attention dropout
  • Embedding dropout
  • MLP dropout (in encoder)
  • Add classification head
  • Complete ViT-Base
  • Name layers so the model is compatible with torchvision
  • Add training scripts
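
As a rough illustration of the patch-embedding and class-token items above, here is a minimal sketch. It is not this repository's implementation: the class name PatchEmbedding and its arguments are assumptions made for the example. It also suggests one answer to the class-token question: a parameter's shape cannot depend on the batch size, so a single learned token of shape (1, 1, embed_dim) is stored and expanded to (bs, 1, embed_dim) in the forward pass.

```python
import torch
import torch.nn as nn


class PatchEmbedding(nn.Module):
    """Hypothetical sketch: image -> patch tokens with a prepended class token."""

    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening each patch
        # and applying one shared linear projection.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        # One learned token shared by every sample; no batch dimension is baked in.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # Learned position embedding for the class token plus all patches.
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, embed_dim))

    def forward(self, x):
        bs = x.shape[0]
        x = self.proj(x)                          # (bs, embed_dim, H/ps, W/ps)
        x = x.flatten(2).transpose(1, 2)          # (bs, num_patches, embed_dim)
        cls = self.cls_token.expand(bs, -1, -1)   # (bs, 1, embed_dim), no copy
        x = torch.cat([cls, x], dim=1)            # (bs, num_patches + 1, embed_dim)
        return x + self.pos_embed                 # broadcasts over the batch


embed = PatchEmbedding()
tokens = embed(torch.randn(2, 3, 224, 224))
print(tokens.shape)  # torch.Size([2, 197, 768])
```

Because the token is expanded rather than stored per sample, the same module works for any batch size, and every sample receives gradients into the same learned vector.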

Install

pip install vit_pytorch

How to use

Load a config.yml file and pass it to the ViT module to modify architecture parameters.
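
The README does not yet document the config keys or the ViT constructor, so the following is only a sketch of how a YAML config could be parsed and validated before being handed to the model. The ViTConfig dataclass, its field names, and load_config are hypothetical; the defaults follow the ViT-Base sizes from the paper. Parsing uses yaml.safe_load from PyYAML.

```python
from dataclasses import dataclass, fields

import yaml  # PyYAML


@dataclass
class ViTConfig:
    # Hypothetical field names; defaults are the ViT-Base values from the paper.
    img_size: int = 224
    patch_size: int = 16
    embed_dim: int = 768
    depth: int = 12
    num_heads: int = 12
    mlp_dim: int = 3072
    num_classes: int = 1000


def load_config(text: str) -> ViTConfig:
    """Parse YAML text and reject keys that the config does not define."""
    raw = yaml.safe_load(text) or {}
    known = {f.name for f in fields(ViTConfig)}
    unknown = set(raw) - known
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return ViTConfig(**raw)


# Stand-in for the contents of config.yml; only overridden values are listed.
example_yaml = """
patch_size: 32
num_classes: 10
"""
cfg = load_config(example_yaml)
print(cfg)
```

Rejecting unknown keys catches typos in config.yml early, instead of silently falling back to a default architecture.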