CONFIG_PATH = '../config.yml'
DATA_PATH = Path('../input')
Transformer Encoder
Load parameters from the config file.
config = yaml.safe_load(open(CONFIG_PATH))
dset = datasets.CIFAR10(DATA_PATH, download=True)
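For orientation, the keys read from `config` in this notebook suggest a structure like the one sketched here. This is only a guess at what `config.yml` parses to: the values for `hw`, `size`, `n` and `out_ch` are taken from outputs shown later on this page, while `in_ch` and `msa_heads` are assumptions (RGB input, ViT-Base head count). The `check_config` helper is hypothetical and not part of the project.

```python
# Hypothetical dict equivalent of config.yml, as yaml.safe_load would return it.
config = {
    "data": {"hw": [224, 224]},        # target image height and width
    "patch": {
        "in_ch": 3,                    # assumed: RGB input channels
        "out_ch": 768,                 # embedding dimension
        "size": 16,                    # side length of each square patch
        "n": 196,                      # number of patches per image
    },
    "encoder": {"msa_heads": 12},      # assumed: ViT-Base uses 12 heads
}


def check_config(cfg):
    """Verify that the patch and attention settings are mutually consistent."""
    h, w = cfg["data"]["hw"]
    p = cfg["patch"]["size"]
    # The image must split into a whole number of patches.
    assert h % p == 0 and w % p == 0
    # The declared patch count must match the grid implied by hw and size.
    assert cfg["patch"]["n"] == (h // p) * (w // p)
    # Multi-head attention needs the embedding to split evenly across heads.
    assert cfg["patch"]["out_ch"] % cfg["encoder"]["msa_heads"] == 0
    return True


print(check_config(config))
```

A check like this catches a mismatched `n` or head count before any tensor is built.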
Files already downloaded and verified
images, targets = dset.data, dset.targets
len(images), len(targets)
(50000, 50000)
Prepare a small batch of images to test the image processing.
images.shape
(50000, 32, 32, 3)
Sample three random indices and use them to select a small batch of images.
image_idx = np.random.randint(low=0, high=len(images), size=3)
# corresponding labels
targets = [targets[t] for t in image_idx]
targets
[8, 3, 5]
in_ch = config["patch"]["in_ch"]
out_ch = config["patch"]["out_ch"]
# size of each small patch
patch_size = config['patch']['size']
patch_size
16
images.shape[1:]
(32, 32, 3)
images = torch.Tensor(images[image_idx])
images = images/255.
images.shape
torch.Size([3, 32, 32, 3])
Resize the images to \(224\times 224\), the input resolution used in the ViT paper.
hw = config['data']['hw']
augs = T.Resize(hw)
augs
Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=warn)
images = augs(images.permute(0, 3, 1, 2))
images.shape
torch.Size([3, 3, 224, 224])
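The layout change and resize above can be reproduced without the dataset. The sketch below uses random data in place of CIFAR-10 and swaps in `torch.nn.functional.interpolate` for `T.Resize` so that it depends on `torch` alone; it is a stand-in, not the notebook's own code.

```python
import torch
import torch.nn.functional as F

# Stand-in for three CIFAR-10 images scaled to [0, 1], channels last (N, H, W, C).
batch_nhwc = torch.rand(3, 32, 32, 3)

# PyTorch image ops expect channels first (N, C, H, W).
batch_nchw = batch_nhwc.permute(0, 3, 1, 2)

# Bilinear upsampling to 224x224, the same interpolation mode T.Resize reports above.
resized = F.interpolate(batch_nchw, size=(224, 224), mode="bilinear", align_corners=False)

print(batch_nchw.shape)
print(resized.shape)
```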
Make Embedded Patches
patch_embed = PatchEmbedding(config)(images)
patch_embed.shape
torch.Size([3, 197, 768])
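The sequence length of 197 is consistent with \(224/16 = 14\) patches per side, giving \(14^2 = 196\) patches, plus one class token. `PatchEmbedding` itself is defined elsewhere, so the following is only a hypothetical stand-in built the way the ViT paper describes it: a strided convolution for the patch projection, a prepended class token, and a learned position embedding. The class name and constructor arguments are illustrative.

```python
import torch
import torch.nn as nn


class PatchEmbeddingSketch(nn.Module):
    """Hypothetical stand-in for PatchEmbedding; the real class may differ."""

    def __init__(self, in_ch=3, out_ch=768, patch_size=16, n_patches=196):
        super().__init__()
        # A conv with kernel == stride projects each non-overlapping patch.
        self.proj = nn.Conv2d(in_ch, out_ch, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, out_ch))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, out_ch))

    def forward(self, x):
        x = self.proj(x)                      # (B, out_ch, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)      # (B, n_patches, out_ch)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)        # (B, n_patches + 1, out_ch)
        return x + self.pos_embed             # broadcast over the batch


tokens = PatchEmbeddingSketch()(torch.rand(3, 3, 224, 224))
print(tokens.shape)
```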
Prepare Transformer Layer
Apply LayerNorm over the embedding dimension, which in our case is \(768\).
seq_len = config['patch']['n']
embed_dim = config['patch']['out_ch']
seq_len, embed_dim
(196, 768)
x_ln = nn.LayerNorm(normalized_shape=embed_dim)(patch_embed)
x_ln.shape
torch.Size([3, 197, 768])
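To make "over the embedding dimension" concrete, this small check uses random data in place of `patch_embed` and confirms that every token vector is normalized independently to roughly zero mean and unit standard deviation. The scale and shift applied to the input are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Random stand-in for patch_embed, deliberately far from zero mean / unit variance.
tokens = torch.randn(3, 197, 768) * 5 + 2

# normalized_shape=768 means statistics are computed over the last dimension only.
normed = nn.LayerNorm(normalized_shape=768)(tokens)

per_token_mean = normed.mean(dim=-1)                 # shape (3, 197)
per_token_std = normed.std(dim=-1, unbiased=False)   # shape (3, 197)

print(per_token_mean.abs().max().item())
print(per_token_std.min().item(), per_token_std.max().item())
```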
num_heads = config['encoder']['msa_heads']
attn_output, attn_output_weights = nn.MultiheadAttention(embed_dim=embed_dim, num_heads=num_heads)(x_ln, x_ln, x_ln)
attn_output.shape
torch.Size([3, 197, 768])
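One detail worth checking here: `nn.MultiheadAttention` defaults to `batch_first=False`, so a `(3, 197, 768)` input is read as sequence length 3 and batch size 197. The output shape is the same either way, but the attention weights reveal the difference. The sketch below uses random data and assumes 12 heads; the actual `msa_heads` value is not shown on this page.

```python
import torch
import torch.nn as nn

x = torch.rand(3, 197, 768)  # intended layout: (batch, tokens, embedding)

mha_default = nn.MultiheadAttention(embed_dim=768, num_heads=12)
mha_batch_first = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)

with torch.no_grad():
    out_d, w_d = mha_default(x, x, x)
    out_b, w_b = mha_batch_first(x, x, x)

print(out_d.shape, w_d.shape)  # weights cover 3 positions for each of 197 "batch" items
print(out_b.shape, w_b.shape)  # weights cover 197 tokens for each of 3 images
```

If attention is meant to run across the 197 tokens of each image, passing `batch_first=True` (or transposing the input) is likely what is wanted.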
Prepare MSA block
MultiheadSelfAttn
MultiheadSelfAttn (config)
x = MultiheadSelfAttn(config)(patch_embed)
x.shape
torch.Size([3, 197, 768])
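The source of `MultiheadSelfAttn` is not shown on this page. Since it is called on the un-normalized `patch_embed`, one plausible reading is that it wraps LayerNorm followed by self-attention. The sketch below is written under that assumption; the class name, the default of 12 heads, and the choice to leave the residual connection to the encoder layer are all guesses.

```python
import torch
import torch.nn as nn


class MSABlockSketch(nn.Module):
    """Hypothetical MSA block: LayerNorm followed by multi-head self-attention."""

    def __init__(self, embed_dim=768, num_heads=12, dropout=0.0):
        super().__init__()
        self.norm = nn.LayerNorm(embed_dim)
        # batch_first=True so inputs are (batch, tokens, embedding).
        self.attn = nn.MultiheadAttention(
            embed_dim, num_heads, dropout=dropout, batch_first=True
        )

    def forward(self, x):
        h = self.norm(x)
        # Self-attention: query, key and value are the same tensor.
        out, _ = self.attn(h, h, h, need_weights=False)
        return out


msa_out = MSABlockSketch()(torch.rand(3, 197, 768))
print(msa_out.shape)
```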
Prepare MLP block
MLPBlock
MLPBlock (config)
x = MLPBlock(config)(x)
x.shape
torch.Size([3, 197, 768])
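`MLPBlock` is likewise defined elsewhere. A common form, following the ViT paper, is LayerNorm, a linear expansion, GELU, and a linear projection back to the embedding size. The sketch below assumes that form; the hidden size of 3072 and the dropout rate are ViT-Base-style assumptions, not values read from the config.

```python
import torch
import torch.nn as nn


class MLPBlockSketch(nn.Module):
    """Hypothetical MLP block: LayerNorm, expand, GELU, project back."""

    def __init__(self, embed_dim=768, hidden_dim=3072, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),   # expand
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, embed_dim),   # project back
            nn.Dropout(dropout),
        )

    def forward(self, x):
        # The linear layers act on the last dimension, so each token is
        # transformed independently and the shape is preserved.
        return self.mlp(self.norm(x))


mlp_block = MLPBlockSketch().eval()  # eval() disables dropout for a deterministic check
with torch.no_grad():
    mlp_out = mlp_block(torch.rand(3, 197, 768))
print(mlp_out.shape)
```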
Transformer Encoder
TransformerEncoderLayer
TransformerEncoderLayer (config)
out = TransformerEncoderLayer(config)(patch_embed)
out.shape
torch.Size([3, 197, 768])
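In the ViT paper, an encoder layer combines the two blocks with residual connections: \(z' = \mathrm{MSA}(\mathrm{LN}(z)) + z\), then \(z'' = \mathrm{MLP}(\mathrm{LN}(z')) + z'\). The sketch below implements that pre-norm pattern directly. Whether the project's `TransformerEncoderLayer` places the residuals in exactly this way is an assumption, as are the head count and hidden size.

```python
import torch
import torch.nn as nn


class EncoderLayerSketch(nn.Module):
    """Hypothetical pre-norm encoder layer with two residual connections."""

    def __init__(self, embed_dim=768, num_heads=12, hidden_dim=3072):
        super().__init__()
        self.norm1 = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                  # first residual: z' = MSA(LN(z)) + z
        x = x + self.mlp(self.norm2(x))   # second residual: z'' = MLP(LN(z')) + z'
        return x


with torch.no_grad():
    layer_out = EncoderLayerSketch()(torch.rand(3, 197, 768))
print(layer_out.shape)
```

Because both sub-blocks preserve the `(batch, tokens, embedding)` shape, layers of this kind can be stacked without any reshaping in between.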
Multilayered Transformer Encoder
TransformerEncoder
TransformerEncoder (config)
TransformerEncoder(config)(patch_embed).shape
torch.Size([3, 197, 768])
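For comparison, PyTorch ships built-in modules that produce a similar stack. The sketch below is not the project's `TransformerEncoder`; it uses `nn.TransformerEncoderLayer` with `norm_first=True` and GELU to approximate a ViT-style encoder. The depth of 2 is chosen only to keep the example fast (ViT-Base uses 12 layers), and the head count and feed-forward size are assumptions.

```python
import torch
import torch.nn as nn

# One pre-norm layer with GELU, taking (batch, tokens, embedding) input.
layer = nn.TransformerEncoderLayer(
    d_model=768,
    nhead=12,               # assumed head count
    dim_feedforward=3072,   # assumed MLP hidden size
    activation="gelu",
    batch_first=True,
    norm_first=True,
)

# nn.TransformerEncoder deep-copies the layer num_layers times.
encoder = nn.TransformerEncoder(layer, num_layers=2).eval()

with torch.no_grad():
    encoded = encoder(torch.rand(3, 197, 768))
print(encoded.shape)
```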