How To Define A ReLU Layer In PyTorch

Video Transcript

Now that we know how to define a sequential container and a 2D convolutional layer, the next step is to learn how to define the activation layers that we will place between our convolutional layers.

For this, we want to import torch.nn as nn.

import torch.nn as nn

And for a specific example, we will also want to import Python's random library.

import random

Two issues that can arise when optimizing a neural network are second-order effects in activation functions and saturation of an activated unit.

Second-order effects cause issues because linear functions are more easily optimized than their non-linear counterparts.

Saturation occurs when two conditions are satisfied: first, the activation function is asymptotically flat, and second, the absolute value of the input to the unit is large enough that the output of the activation function falls in the asymptotically flat region.

Since the gradient in the flat region is close to zero, training via stochastic gradient descent is unlikely to keep updating the unit's parameters in a meaningful way.

This often arises when using tanh and sigmoid activation functions.
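To make saturation concrete, here is a minimal sketch that uses autograd to measure the gradient of sigmoid and tanh at a few inputs. The sample inputs 0, 2, and 10 are arbitrary values chosen for illustration.

```python
import torch

# Illustrative inputs: the origin, a moderate value, and a value far in the flat tail
z = torch.tensor([0.0, 2.0, 10.0], requires_grad=True)

# Summing gives a scalar, so backward() leaves the element-wise derivative in z.grad
torch.sigmoid(z).sum().backward()
sigmoid_grad = z.grad.clone()

# Clear the accumulated gradient before measuring tanh
z.grad.zero_()
torch.tanh(z).sum().backward()
tanh_grad = z.grad.clone()

print("sigmoid gradient:", sigmoid_grad)
print("tanh gradient:   ", tanh_grad)
```

In both cases the gradient is largest at 0 and shrinks toward zero as the input grows, which is the flat region described above.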

A popular unit that avoids these two issues is the rectified linear unit or ReLU.

It uses the activation function g(z) = max(0, z).

These units are linear almost everywhere, which means they do not have second-order effects, and their derivative is 1 anywhere that the unit is activated.

Therefore, an activated unit avoids the issue of saturation, no matter how large its input is.
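As a quick check of this behavior, the sketch below applies ReLU to a few hand-picked inputs and reads the derivative back from autograd. The input values are arbitrary and only for illustration.

```python
import torch
import torch.nn as nn

relu = nn.ReLU()

# Illustrative inputs: two negative values, a small positive one, and a large positive one
z = torch.tensor([-3.0, -0.5, 0.5, 100.0], requires_grad=True)

out = relu(z)         # g(z) = max(0, z), applied to each element
out.sum().backward()  # leaves dg/dz for each element in z.grad

print(out)      # negative inputs become 0, positive inputs pass through unchanged
print(z.grad)   # 1 where the unit is active, 0 where it is not
```

Note that the gradient stays at 1 even for the input of 100, whereas a unit with a negative input receives a gradient of 0.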

In PyTorch, you can construct a ReLU layer by instantiating the nn.ReLU module with the argument inplace=False and assigning it to relu1.

relu1 = nn.ReLU(inplace=False)

Since the ReLU function is applied element-wise, there’s no need to specify input or output dimensions.
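A small sketch of what element-wise means in practice: the same layer accepts tensors of different shapes and returns an output of the same shape. The tensor shapes here are hypothetical, chosen only for the demonstration.

```python
import torch
import torch.nn as nn

relu = nn.ReLU(inplace=False)

# Hypothetical inputs: a flat vector and a batch of 2 three-channel 8x8 images
vector = torch.randn(5)
images = torch.randn(2, 3, 8, 8)

vector_out = relu(vector)
images_out = relu(images)

print(vector_out.shape)  # same shape as vector
print(images_out.shape)  # same shape as images

# The layer has no learnable parameters, so there is nothing to size
num_params = len(list(relu.parameters()))
print(num_params)
```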

The argument inplace determines how the function treats the input.

If inplace is set to True, then the input is replaced by the output in memory.

This can reduce memory usage but may not be valid for your particular use case.

The PyTorch autograd documentation also discourages in-place operations in most cases.

If inplace is set to False, then both the input and the output are stored separately in memory.
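The difference between the two settings can be seen in a short sketch like the one below. The tensor values are made up for illustration, and data_ptr() is used only to check whether two tensors share the same memory.

```python
import torch
import torch.nn as nn

# inplace=False: the input is left untouched and the output gets its own memory
x = torch.tensor([-1.0, 2.0, -3.0])
y = nn.ReLU(inplace=False)(x)
separate_storage = x.data_ptr() != y.data_ptr()

# inplace=True: the input tensor itself is overwritten with the output
x2 = torch.tensor([-1.0, 2.0, -3.0])
y2 = nn.ReLU(inplace=True)(x2)
shared_storage = x2.data_ptr() == y2.data_ptr()

print(x, y)    # x still holds its negative values
print(x2, y2)  # x2 now holds the ReLU output as well
print(separate_storage, shared_storage)
```

If anything later in your code still needs the original values of the input, inplace=False is the safer choice.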

Once we have defined our ReLU layer, all we need to do is place it between the convolutional layers in our sequential container.

So first, we will define the sequential container.

model = nn.Sequential()

Then we will define our first convolutional layer as done in the previous video.

first_conv_layer = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)

Then we will use add_module to add the first convolutional layer.

model.add_module("Conv1", first_conv_layer)

Then we use add_module again to add our first rectified linear unit layer.

model.add_module("Relu1", relu1)

Alternatively, we can use add_module and define the layer in the same line.

model.add_module("Conv2", nn.Conv2d(in_channels=16, out_channels=32, kernel_size=[3, 3], stride=[1, 1], padding=[1, 1]))

And we do the same with our second ReLU layer.

model.add_module("Relu2", nn.ReLU(inplace=False))
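To see the pieces working together, here is a self-contained sketch that rebuilds the same four layers and runs one forward pass. The input batch of 4 random 32x32 RGB images is an assumption made for illustration; the transcript does not specify an input size.

```python
import torch
import torch.nn as nn

# Rebuild the container with the same layer names and settings used above
model = nn.Sequential()
model.add_module("Conv1", nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1))
model.add_module("Relu1", nn.ReLU(inplace=False))
model.add_module("Conv2", nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=1, padding=1))
model.add_module("Relu2", nn.ReLU(inplace=False))

# Assumed input: a batch of 4 random images with 3 channels and 32x32 pixels
batch = torch.randn(4, 3, 32, 32)
out = model(batch)

print(model)      # lists the layers in the order they were added
print(out.shape)  # torch.Size([4, 32, 32, 32])
```

With a 3x3 kernel, stride 1, and padding 1, each convolution preserves the height and width, so only the channel count changes. Because the last layer is a ReLU, every value in the output is non-negative.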

Now, all this model needs is an output layer.