Transformer Encoder

Methods to build a transformer encoder block.
CONFIG_PATH = '../config.yml'
DATA_PATH = Path('../input')

Load parameters from the config file.

# open the config inside a context manager so the file handle is closed
with open(CONFIG_PATH) as f:
    config = yaml.safe_load(f)
dset = datasets.CIFAR10(DATA_PATH, download=True)
Files already downloaded and verified
images, targets = dset.data, dset.targets
len(images), len(targets)
(50000, 50000)

Prepare a small batch of images to test the image processing.

images.shape
(50000, 32, 32, 3)

Randomly sample three indices and use them to select the test images and their labels.

image_idx = np.random.randint(low=0, high=len(images), size=3)
# corresponding labels
targets = [targets[t] for t in image_idx]
targets
[8, 3, 5]
in_ch = config["patch"]["in_ch"]
out_ch = config["patch"]["out_ch"]
# size of each small patch
patch_size = config['patch']['size']
patch_size
16
images.shape[1:]
(32, 32, 3)
images = torch.Tensor(images[image_idx])
images = images/255.
images.shape
torch.Size([3, 32, 32, 3])

Increase the image size to \(224\times 224\) to match the ViT paper.

hw = config['data']['hw']
augs = T.Resize(hw)
augs
Resize(size=[224, 224], interpolation=bilinear, max_size=None, antialias=warn)
images = augs(images.permute(0, 3, 1, 2))
images.shape
torch.Size([3, 3, 224, 224])
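
The layout change and resize above can be reproduced without the dataset. The sketch below is only an illustration: it uses a random stand-in batch, and it swaps `T.Resize` for `torch.nn.functional.interpolate` with bilinear mode, which is a different call from the one used in this notebook.

```python
import torch
import torch.nn.functional as F

# Random stand-in for the scaled CIFAR-10 batch (N, H, W, C); not real data.
batch_nhwc = torch.rand(3, 32, 32, 3)

# PyTorch image ops expect channels first: (N, H, W, C) -> (N, C, H, W).
batch_nchw = batch_nhwc.permute(0, 3, 1, 2)

# Bilinear upsampling to the 224 x 224 resolution used above.
resized = F.interpolate(batch_nchw, size=(224, 224), mode="bilinear", align_corners=False)
print(resized.shape)  # torch.Size([3, 3, 224, 224])
```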

Make Embedded Patches

patch_embed = PatchEmbedding(config)(images)
patch_embed.shape
torch.Size([3, 197, 768])
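
The sequence length of 197 is consistent with \(14\times 14 = 196\) patches of size 16 plus one extra token. The sketch below shows one common way to get that shape. `PatchEmbeddingSketch`, its default arguments, the class token, and the position embeddings are all assumptions made for illustration; the actual `PatchEmbedding` class may be implemented differently.

```python
import torch
import torch.nn as nn


class PatchEmbeddingSketch(nn.Module):
    """Hypothetical stand-in for PatchEmbedding, for illustration only."""

    def __init__(self, in_ch=3, embed_dim=768, patch_size=16, img_size=224):
        super().__init__()
        n_patches = (img_size // patch_size) ** 2  # 14 * 14 = 196 with the defaults
        # A strided convolution cuts the image into non-overlapping patches
        # and linearly projects each patch to embed_dim.
        self.proj = nn.Conv2d(in_ch, embed_dim, kernel_size=patch_size, stride=patch_size)
        # Assumed: one learnable class token and learnable position embeddings.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, embed_dim))

    def forward(self, x):
        x = self.proj(x)                  # (B, embed_dim, 14, 14)
        x = x.flatten(2).transpose(1, 2)  # (B, 196, embed_dim)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)    # (B, 197, embed_dim)
        return x + self.pos_embed


sketch_out = PatchEmbeddingSketch()(torch.rand(3, 3, 224, 224))
print(sketch_out.shape)  # torch.Size([3, 197, 768])
```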

Prepare Transformer Layer

Apply LayerNorm over the embedding dimension, which in our case is \(768\).

seq_len = config['patch']['n']
embed_dim = config['patch']['out_ch']
seq_len, embed_dim
(196, 768)
x_ln = nn.LayerNorm(normalized_shape=embed_dim)(patch_embed)
x_ln.shape
torch.Size([3, 197, 768])
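
To see what normalizing over the embedding dimension means, the check below runs `nn.LayerNorm` on a random stand-in tensor (not the real `patch_embed`) and confirms that each token vector ends up with roughly zero mean and unit variance across its 768 features. The scale and shift applied to the random input are arbitrary choices for the illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Stand-in for patch_embed, deliberately far from zero mean / unit variance.
tokens = torch.randn(3, 197, 768) * 5 + 2
normed = nn.LayerNorm(normalized_shape=768)(tokens)

# Statistics are computed per token over the last (embedding) dimension,
# so batch and sequence positions do not influence each other.
token_mean = normed.mean(dim=-1)
token_var = normed.var(dim=-1, unbiased=False)

mean_ok = token_mean.abs().max().item() < 1e-4
var_ok = torch.allclose(token_var, torch.ones_like(token_var), atol=1e-3)
print(mean_ok, var_ok)
```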
num_heads = config['encoder']['msa_heads']
# batch_first=True because x_ln is laid out as (batch, seq, embed)
attn_output, attn_output_weights = nn.MultiheadAttention(embed_dim=embed_dim, num_heads=num_heads, batch_first=True)(x_ln, x_ln, x_ln)
attn_output.shape
torch.Size([3, 197, 768])

Prepare MSA block


MultiheadSelfAttn

 MultiheadSelfAttn (config)

Docstring inherited from torch.nn.Module, the base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will have their parameters converted too when you call to(), etc.

Note: As per the example above, an __init__() call to the parent class must be made before assignment on the child.

training (bool): whether this module is in training or evaluation mode.

x = MultiheadSelfAttn(config)(patch_embed)
x.shape
torch.Size([3, 197, 768])
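
A minimal sketch of what such a block could look like is given below. `MSASketch` is a hypothetical name, and both the pre-attention LayerNorm and the choice of 12 heads are assumptions; the real `MultiheadSelfAttn` reads its settings from `config` and may differ.

```python
import torch
import torch.nn as nn


class MSASketch(nn.Module):
    """Hypothetical pre-norm self-attention block, for illustration only."""

    def __init__(self, embed_dim=768, num_heads=12):
        super().__init__()
        self.norm = nn.LayerNorm(embed_dim)
        # batch_first=True so the input is read as (batch, seq, embed).
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

    def forward(self, x):
        x = self.norm(x)
        # Self-attention: query, key and value are the same tensor.
        out, weights = self.attn(x, x, x, need_weights=True)
        return out, weights


msa_out, msa_weights = MSASketch()(torch.rand(3, 197, 768))
# The weights are averaged over heads: one 197 x 197 map per image.
print(msa_out.shape, msa_weights.shape)
```

With `batch_first=False` (the default), the same input would be interpreted as a sequence of length 3 with batch size 197, and the weight maps would come out as 3 x 3 instead.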

Prepare MLP block


MLPBlock

 MLPBlock (config)

x = MLPBlock(config)(x)
x.shape
torch.Size([3, 197, 768])
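
For orientation, here is a sketch of a typical ViT-style MLP block. `MLPSketch`, the hidden width of 3072, the GELU activation, and the dropout rate are assumptions for illustration, not values read from `config`.

```python
import torch
import torch.nn as nn


class MLPSketch(nn.Module):
    """Hypothetical pre-norm MLP block, for illustration only."""

    def __init__(self, embed_dim=768, hidden_dim=3072, dropout=0.1):
        super().__init__()
        self.norm = nn.LayerNorm(embed_dim)
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),  # expand
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, embed_dim),  # project back to embed_dim
            nn.Dropout(dropout),
        )

    def forward(self, x):
        # The linear layers act on the last dimension, so every token
        # is transformed independently with shared weights.
        return self.net(self.norm(x))


mlp_out = MLPSketch()(torch.rand(3, 197, 768))
print(mlp_out.shape)  # torch.Size([3, 197, 768])
```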

Transformer Encoder Layer


TransformerEncoderLayer

 TransformerEncoderLayer (config)

out = TransformerEncoderLayer(config)(patch_embed)
out.shape
torch.Size([3, 197, 768])
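
An encoder layer usually combines the attention and MLP sub-blocks with residual connections. The sketch below assumes the pre-norm arrangement from the ViT paper; `EncoderLayerSketch` and its default sizes are hypothetical and are not taken from the `TransformerEncoderLayer` source.

```python
import torch
import torch.nn as nn


class EncoderLayerSketch(nn.Module):
    """Hypothetical pre-norm encoder layer, for illustration only."""

    def __init__(self, embed_dim=768, num_heads=12, hidden_dim=3072):
        super().__init__()
        self.norm1 = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, x):
        # Residual connection around self-attention.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        # Residual connection around the MLP.
        x = x + self.mlp(self.norm2(x))
        return x


layer_out = EncoderLayerSketch()(torch.rand(3, 197, 768))
print(layer_out.shape)  # torch.Size([3, 197, 768])
```

Because both sub-blocks preserve the input shape, the residual additions are well defined and layers of this kind can be stacked freely.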

Multilayered Transformer Encoder


TransformerEncoder

 TransformerEncoder (config)

TransformerEncoder(config)(patch_embed).shape
torch.Size([3, 197, 768])
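
Stacking several identical layers gives the full encoder. As a point of comparison, the sketch below builds a similar stack from PyTorch's built-in `nn.TransformerEncoderLayer` and `nn.TransformerEncoder` rather than the classes defined in this notebook. The hyperparameters are assumptions, and only two layers are used to keep the example light (ViT-Base uses twelve).

```python
import torch
import torch.nn as nn

# Built-in pre-norm layer with GELU, configured for (batch, seq, embed) input.
builtin_layer = nn.TransformerEncoderLayer(
    d_model=768,
    nhead=12,
    dim_feedforward=3072,
    dropout=0.1,
    activation="gelu",
    batch_first=True,
    norm_first=True,
)

# The layer is deep-copied num_layers times, so each copy has its own weights.
builtin_encoder = nn.TransformerEncoder(
    builtin_layer, num_layers=2, enable_nested_tensor=False
)

stack_out = builtin_encoder(torch.rand(3, 197, 768))
print(stack_out.shape)  # torch.Size([3, 197, 768])
```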