Load The MNIST Data Set in TensorFlow So That It Is In One Hot Encoded Format

Import the MNIST data set from the TensorFlow Examples Tutorial Data Repository and encode it in one hot encoded format.

Video Transcript


We’ll begin by creating our file.

# command line
# e stands for emacs
#
e create-simple-feedforward-network.py

We’ll simply call it create-simple-feedforward-network.py.


We’ll begin by importing TensorFlow as tf, as is standard.

import tensorflow as tf


Then we’re going to import a helper module called input_data from tensorflow.examples.tutorials.mnist.

from tensorflow.examples.tutorials.mnist import input_data

It helps us load our data.

Today, we’re going to be using the MNIST data set, which consists of images of handwritten digits from 0 through 9.


We’re going to access our data in this lesson by calling input_data.read_data_sets("MNIST_data/", one_hot=True).

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

This downloads the data, saves it to the MNIST_data folder, and processes it so that the labels are in one hot encoded format.


One hot encoded format means that each label is a vector like this, with ten entries, one for each digit.

[1 0 0 0 0 0]

This one does not have ten entries, obviously.

This is just an example.


Here the columns correspond to the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, so each label has a 1 in the column for its digit and a 0 everywhere else. The vector below labels the digit 0.

  0 1 2 3 4 5 6 7 8 9
[ 1 0 0 0 0 0 0 0 0 0 ]
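
To make the encoding concrete, here is a minimal sketch in plain Python. The function name one_hot is hypothetical and is not part of TensorFlow. It assumes zero-based column order, so that the position of the 1 equals the digit.

```python
def one_hot(digit, num_classes=10):
    """Return a list of num_classes zeros with a single 1 at index `digit`.

    Hypothetical helper for illustration; assumes column i stands for digit i.
    """
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in range(num_classes)")
    vector = [0] * num_classes
    vector[digit] = 1
    return vector


print(one_hot(3))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

Each encoded label has exactly one 1, so the entries of any such vector sum to 1.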


Otherwise, if one_hot were False, the y variable would hold each label as a plain integer such as 1, 2, or 3.

[ 1 ]
[ 2 ]
[ 3 ]
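
The two formats carry the same information. As a rough sketch, a one hot row can be turned back into a plain integer label by finding the position of its 1. The helper to_dense is a hypothetical name, and the rows below are made-up sample labels, again assuming that column i stands for digit i.

```python
def to_dense(one_hot_row):
    """Return the index of the largest entry, i.e. the position of the 1."""
    return max(range(len(one_hot_row)), key=lambda i: one_hot_row[i])


# Made-up one hot labels for the digits 1, 2 and 3.
labels = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
]

dense = [to_dense(row) for row in labels]
print(dense)  # [1, 2, 3]
```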


The first thing we have to do is create a placeholder variable.

x = tf.placeholder(tf.float32, shape=[None, 784])

This is how our data enters the TensorFlow graph.

We do this by calling the tf.placeholder function.

The most important arguments here are the type and the shape. The type is tf.float32, indicating that we’re going to use 32-bit floats to represent our data. The shape describes a tensor whose first dimension is unknown, which is why we set it to None; it corresponds to the number of examples that we have.

Then the second dimension is the size of each image, which in this case is 784.
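
To illustrate what the None in the shape means, here is a small sketch in plain Python. The helper shape_matches is hypothetical and only mimics the idea; TensorFlow performs its own shape checking internally.

```python
def shape_matches(declared, actual):
    """Return True if `actual` fits `declared`, where None means 'any size'.

    Hypothetical helper for illustration; not a TensorFlow function.
    """
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))


declared = [None, 784]

print(shape_matches(declared, (1, 784)))      # True: a single example
print(shape_matches(declared, (100, 784)))    # True: a batch of 100 examples
print(shape_matches(declared, (100, 28, 28))) # False: images are not flattened
```

Under this reading, a batch of any size is accepted as long as every example has been flattened to 784 values.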


We get 784 because these are 28 by 28 pixel images, and 28 x 28 = 784.

# command line
# open the python interpreter
# ~ > python
#
28 * 28
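
The same arithmetic can be seen by flattening a 28 by 28 grid into a single row of 784 values and back. This is a sketch in plain Python with stand-in pixel values, not real MNIST data; the names image, flat, and restored are illustrative.

```python
ROWS, COLS = 28, 28

# Stand-in "image": each pixel holds its own position so the layout is easy to check.
image = [[r * COLS + c for c in range(COLS)] for r in range(ROWS)]

# Flatten row by row into one list of 784 values.
flat = [pixel for row in image for pixel in row]

# Rebuild the 28 by 28 grid from the flat list.
restored = [flat[r * COLS:(r + 1) * COLS] for r in range(ROWS)]

print(len(flat))          # 784
print(restored == image)  # True
```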



Full Source Code For Lesson

# create-simple-feedforward-network.py
#
# to run
# python create-simple-feedforward-network.py
#
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, shape=[None, 784])
