Build Your First Deep Learning Model in TensorFlow

This tutorial is for anyone just starting out with TensorFlow. I will walk through a basic neural network using the very popular MNIST dataset, where the training features are the pixel values of images of handwritten digits and the target variable is the digit itself.

The MNIST dataset can be loaded from the TensorFlow library itself. 

import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

As you can see, the dataset already comes split into training and test sets.

Let’s check the shape of the training features:

x_train.shape

Output:

(60000, 28, 28)
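The labels and the test set can be sanity-checked the same way; these shapes are the standard MNIST split:

y_train.shape  # (60000,) - one integer label from 0 to 9 per training image
x_test.shape   # (10000, 28, 28) - 10,000 test images of 28x28 pixels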

Let's render one of the images from its pixel values:

import matplotlib.pyplot as plt
plt.imshow(x_train[10])  # render the 28x28 pixel grid as an image
plt.show()

Output:

[Image: a 28×28 plot of the handwritten digit 3]

So, the datapoint at index 10 of the training set represents the digit 3.
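We can confirm this from the label array:

y_train[10]

Output:

3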

Here is the model for the classification. It is a Sequential model. The first layer flattens the input, since each image is a two-dimensional 28x28 array.

The second layer is a Dense layer with 50 neurons and the "relu" activation function. The number of neurons is a hyperparameter, so please feel free to try a different number of neurons, or a different activation function such as "tanh" or "sigmoid".

The third layer is the final layer, where we get 10 outputs for the 10 classes, so it has 10 neurons. The activation function for this layer is 'softmax', because the softmax function turns the raw outputs into decimal probabilities, one per class, that sum to one.
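To see what softmax does, here is a minimal sketch with made-up logits; the three outputs are decimal probabilities that sum to one:

# Made-up logits, just for illustration
logits = tf.constant([2.0, 1.0, 0.1])
tf.nn.softmax(logits)  # ~[0.66, 0.24, 0.10], sums to 1

Now, the model itself: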

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),   # 28x28 image -> 784-long vector
    tf.keras.layers.Dense(50, activation="relu"),    # hidden layer, 50 neurons
    tf.keras.layers.Dense(10, activation="softmax")  # output layer, one neuron per digit
])
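If you want to inspect the architecture, model.summary() prints the layers and their parameter counts. The counts follow from the layer sizes: the hidden layer has 784 × 50 + 50 = 39,250 weights and biases, and the output layer has 50 × 10 + 10 = 510, for 39,760 trainable parameters in total.

model.summary()  # lists each layer with its output shape and parameter count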

We need to compile the model before training. That means defining a loss function, an optimizer, and a metric to monitor during training. I used Stochastic Gradient Descent (SGD) as the optimizer here.

Here is the loss function, followed by compiling and training the model:

# SparseCategoricalCrossentropy works with integer labels (0-9), as in MNIST
loss_function = tf.keras.losses.SparseCategoricalCrossentropy()
model.compile(optimizer='SGD',
              loss=loss_function,
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)

Output:

Epoch 1/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.6879 - accuracy: 0.8256
Epoch 2/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.3513 - accuracy: 0.9026
Epoch 3/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.3054 - accuracy: 0.9139
Epoch 4/5
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2776 - accuracy: 0.9212
Epoch 5/5
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2561 - accuracy: 0.9274

Model training is done. Here we evaluate the model using the test set:

model.evaluate(x_test, y_test)

Output:

[0.2409869134426117, 0.9318000078201294]

On the test data, we get about 93% accuracy, with a loss of about 0.241. (evaluate() returns the loss first, then the accuracy metric.)
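Finally, here is a minimal sketch of using the trained model on new images: predict() returns the 10 softmax probabilities for each image, and argmax picks the most likely digit. The slice x_test[:1] keeps the batch dimension that predict() expects.

import numpy as np
probabilities = model.predict(x_test[:1])      # shape (1, 10): one probability per digit
predicted_digit = np.argmax(probabilities[0])  # index of the highest probability
print(predicted_digit, y_test[0])              # predicted vs. true label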
