Add Metrics Reporting To Improve Your TensorFlow Neural Network Model

Add Metrics Reporting to Improve Your TensorFlow Neural Network Model So You Can Monitor How Accuracy And Other Measures Evolve As You Change Your Model.


Video Transcript


Today, we're going to discuss how to add accuracy reporting to an already existing neural network.

We're going to do that by starting with a simple neural network that we created in a previous screencast.

# create-simple-feedforward-network.py
#
# to run
# python create-simple-feedforward-network.py
#
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, shape=[None, 784])

W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())

b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))

y = tf.nn.relu(tf.matmul(x, W) + b)

y_ = tf.placeholder(tf.float32, [None, 10])

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

for step in range(50):
    print(f"training step: {step}")
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

Here, we have a neural network that is learning to classify images from the classical MNIST dataset by using a simple one-layer neural network with a ReLU activation and using gradient descent on cross-entropy.


What we're going to do is take this model, which we can see trains here, look at its accuracy, and then play around with a few things to see what we can do to make it work better.


So here, let’s just go over quickly what we have.


So here, we have a network with an input placeholder x and two variables, W and b, for the weights and the bias, and we run the result through a ReLU activation.


We have a placeholder y_ that will contain the true labels for each example, and we run our predictions and labels through cross-entropy to get our loss function.


Then for each step in our loop, we print which step we're on, take a batch of data from our dataset, and train our model on that batch.


What we want to do is we want to get an understanding of how accurate our model is.


So the first thing that we're going to do is we’re going to define a variable that tells us if our prediction is correct or not.

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

We're going to do that by calling the tf.equal function on the results of two tf.argmax calls.

What this does is find the index of the highest value in each of our y and y_ tensors and then check whether that index is the same for both.

The output will then be True or False for each example, depending on whether the predicted index matches the true index.
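
For a quick, self-contained sketch of what this comparison does, we can run tf.argmax and tf.equal on two tiny hand-made examples (the tensor names and values here are made up for illustration; they are not part of the lesson's model):

# illustrate tf.argmax + tf.equal on two made-up examples
import tensorflow as tf

y_demo  = tf.constant([[0.1, 0.7, 0.2],    # highest value at index 1
                       [0.8, 0.1, 0.1]])   # highest value at index 0
y__demo = tf.constant([[0.0, 1.0, 0.0],    # true class is index 1
                       [0.0, 0.0, 1.0]])   # true class is index 2

# argmax finds the index of the largest value in each row;
# equal then compares the two index vectors element by element
correct_demo = tf.equal(tf.argmax(y_demo, 1), tf.argmax(y__demo, 1))

with tf.Session() as demo_sess:
    print(demo_sess.run(correct_demo))  # [ True False]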


Once we have that, we're going to calculate accuracy by using the tf.reduce_mean function on the output of correct_prediction.

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

First, we need to cast our correct prediction to be a float as it’s currently, I believe, a boolean.

Once we have that, tf.reduce_mean just calculates the average over the entire correct_prediction tensor, which gives us the fraction of examples that were classified correctly.
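
As a minimal sketch of this cast-and-average step, here is the same pattern with a hand-made boolean vector standing in for the real correct_prediction tensor (the values are made up for illustration):

# illustrate tf.cast + tf.reduce_mean on made-up per-example results
import tensorflow as tf

correct_demo = tf.constant([True, False, True, True])               # 3 of 4 correct
accuracy_demo = tf.reduce_mean(tf.cast(correct_demo, tf.float32))   # booleans -> 0.0/1.0, then average

with tf.Session() as demo_sess:
    print(demo_sess.run(accuracy_demo))  # 0.75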


Now that we have this, what we can do is we can calculate our accuracy at each step.

What we'll do here is every 10th step, print out the model accuracy.

if step % 10 == 0:
    print("model accuracy: ")
    print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                        y_: mnist.test.labels}))

So to do that, we're going to run our test dataset through our model using a feed_dict, where the feed_dict just consists of our entire test dataset.


Finally, once we're done, we're going to print out the accuracy of our final model.

print("final model accuracy: ")
print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                    y_: mnist.test.labels}))

When you're using your test dataset, you have to be careful that the operations you run to print out results only read from the graph and don't update your model's variables.

If we instead ran the train step while feeding in these values, our comparison would be inaccurate, because we would be training the model on the test dataset, which would make it impossible to accurately compare results.
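
In other words, evaluation should only ever run the accuracy op on the test data, never the train step. Here is a minimal sketch of that pattern, reusing the placeholders, accuracy op, and session already defined in the lesson's code:

# evaluate on the test set without updating any model variables
test_feed = {x: mnist.test.images, y_: mnist.test.labels}
test_accuracy = sess.run(accuracy, feed_dict=test_feed)  # read-only: no variables change
print(test_accuracy)

# by contrast, this would train on the test set and invalidate the comparison:
# sess.run(train_step, feed_dict=test_feed)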


Now, let's run the model and see what happens.

So currently, we have a learning rate of 0.001.

Let's start with a rate of 0.1 and see what happens.

train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)


# Command line
python test-simple-network.py

What we can see is that we have a final accuracy of 0.1714, which is an improvement over where we started.

The model starts out with 0.1097.

After 10 steps, it has an accuracy of 0.14, then 0.16, then 0.1752, and then the final result is 0.1714.

That’s not very good.

For comparison, state-of-the-art models on MNIST have accuracies well over 98%.


So let's try a few other things.

So the first thing we can try is increasing the number of training steps to 100.

for step in range(100):

What happens if we train it longer?


# Command line
python test-simple-network.py

That’s weird. Our model accuracy actually goes down. What I think is happening here is overfitting.


Let's go back to where we were before.

for step in range(50):


Let's start playing with another variable.

What happens if we have a smaller learning rate?

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)


# Command line
python test-simple-network.py

As you can see, we have an accuracy of 0.278, which is just under 0.3.


Let's see what happens if we continue to decrease our learning rate.

train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)


# Command line
python test-simple-network.py

Wow, that’s quite good!

As you can see, it’s very simple to add testing to your model and to get an understanding of how accurate it is and how that actually changes over time.



Full Source Code For Lesson

# test-simple-network.py
#
# to run
# python test-simple-network.py
#
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, shape=[None, 784])

W = tf.get_variable("weights", shape=[784, 10],
                    initializer=tf.glorot_uniform_initializer())

b = tf.get_variable("bias", shape=[10],
                    initializer=tf.constant_initializer(0.1))

y = tf.nn.relu(tf.matmul(x, W) + b)

y_ = tf.placeholder(tf.float32, [None, 10])

cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(cross_entropy)

# compare the predicted class (argmax of y) with the true class (argmax of y_),
# then cast the booleans to floats and average them to get the accuracy
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

for step in range(50):
    print(f"training step: {step}")
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    # every 10th step, report accuracy on the test set (no training on test data)
    if step % 10 == 0:
        print("model accuracy: ")
        print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                            y_: mnist.test.labels}))

print("final model accuracy: ")
print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                    y_: mnist.test.labels}))
