Create A One Layer Feed Forward Neural Network In TensorFlow With ReLU Activation

Create a one layer feed forward neural network in TensorFlow with ReLU activation and understand the context of the shapes of the Tensors

Video Transcript


For our network, we're going to create a simple one-layer network with a ReLU activation:

max(0, x)

The ReLU function returns the maximum of 0 and whatever value is passed through it.
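To make that concrete, here is a minimal sketch (the input values are made up purely for illustration) of what tf.nn.relu does to a handful of numbers:

import tensorflow as tf

# Negative values are clipped to 0; positive values pass through unchanged
values = tf.constant([-2.0, -0.5, 0.0, 1.5, 3.0])
activated = tf.nn.relu(values)

with tf.Session() as sess:
    print(sess.run(activated))  # [0.  0.  0.  1.5 3. ]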


What that means in the context of a neural network is that we have some layer, with weights W and biases b, and the output of that layer is the maximum of 0 and W times the input x plus our bias.

max(0, W * x + b)


To define this in TensorFlow, we're going to have to define two variables.


One for our weight, which we'll just simply call "weights", and this is going to have a shape of 784 by 10, and we're going to initialize it using the Glorot Uniform Initializer (which is kind of mysterious at this point, but it’s the recommended way to initialize ReLU weights).

W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())

If you want more information about that, there is an excellent book called Deep Learning by Goodfellow, et al.
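As a rough sketch of what that initializer does (this follows the standard Glorot/Xavier uniform formulation, not anything specific to this lesson), it samples each weight uniformly from [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)):

import math

# Glorot uniform samples weights from U(-limit, limit),
# with limit = sqrt(6 / (fan_in + fan_out)).
fan_in, fan_out = 784, 10
limit = math.sqrt(6.0 / (fan_in + fan_out))
print(limit)  # roughly 0.087 for a 784 x 10 weight matrix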


Now that we have the weights, we are also going to add our bias.

b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))

We are going to do the same thing, but this time we're going to call it "bias", and it's going to have a shape of 10.

Now, we're going to initialize our bias to the constant value 0.1.


Just to give you an understanding of the context of the shapes, what's happening is that our x variable is going to hold some number of instances, each of which is represented as a 784-dimensional vector.

What that means is that when we multiply x by W, the result for each instance is a 10-dimensional vector, and we then add our bias to it to get a 10-dimensional output.
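As a quick sanity check (a minimal sketch that just repeats the x and W definitions from this lesson), you can print the static shapes TensorFlow infers:

import tensorflow as tf

# x: a batch of instances, each a 784-dimensional vector; the batch size is left unspecified
x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())

logits = tf.matmul(x, W)
print(x.shape)       # (?, 784)
print(W.shape)       # (784, 10)
print(logits.shape)  # (?, 10) -- one 10-dimensional vector per instance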


The way that we will actually do that is using two functions, tf.matmul and tf.nn.relu.


First, the tf.matmul(x, W) function performs the matrix multiplication of x and W; we then add the bias b and wrap the result in tf.nn.relu.

y = tf.nn.relu(tf.matmul(x, W) + b)

TensorFlow overloads the + operator for tensors, so you don't need to call a separate function to add the bias.
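As a small sketch (the numbers are made up purely to illustrate the broadcasting), the + operator behaves the same as tf.add, and the bias vector is broadcast across every row of the matmul result:

import tensorflow as tf

logits = tf.constant([[1.0, -2.0], [3.0, 0.5]])  # stand-in for tf.matmul(x, W)
b = tf.constant([0.1, 0.1])                      # bias, broadcast across each row

with tf.Session() as sess:
    print(sess.run(logits + b))         # [[ 1.1 -1.9] [ 3.1  0.6]]
    print(sess.run(tf.add(logits, b)))  # identical result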


As a result, our output here will be the y vector, and that is the prediction our network produces.

So y is going to be the output of our simple neural network.
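To see y in action (a hedged sketch that reuses the definitions from the full source below; the batch size of 5 is arbitrary), you can initialize the variables and feed a small batch of MNIST images through the graph:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())
b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))
y = tf.nn.relu(tf.matmul(x, W) + b)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch_xs, _ = mnist.train.next_batch(5)      # 5 images, shape (5, 784)
    print(sess.run(y, feed_dict={x: batch_xs}))  # shape (5, 10), one output vector per image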



Full Source Code For Lesson

# create-simple-feedforward-network.py
#
# to run
# python create-simple-feedforward-network.py
#
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, shape=[None, 784])

W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())

b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))

y = tf.nn.relu(tf.matmul(x, W) + b)
