TensorFlow models can be highly effective, but finding the right hyperparameters is often tedious, and without them a model rarely performs at its best. My previous TensorFlow tutorials showed very good results, but I only presented the final versions; it took many trials to arrive at parameters that worked well.
The Keras ecosystem has a nice tool called Keras Tuner that can be very helpful in finding the right hyperparameters. In this article, we will work through a project, develop a complete model, and see how hyperparameter search with Keras Tuner fits into the workflow.
The prerequisite for this tutorial is that you already know how to work with Keras and TensorFlow.
If you need help learning about TensorFlow and Keras models, please feel free to check out some of my previous tutorials first. The links are at the end of this article.
Also, you need to install keras_tuner. I used a Google Colab notebook for this project and installed it with this line:
!pip install keras_tuner
Now let’s start the project. Here are the necessary imports:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Dense
from tensorflow.keras import backend as K
from tensorflow.keras.optimizers import Adam
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
Now we will develop a few functions that will be used in the project. First, the model_build function, which builds a convolutional neural network with a Mini VGG-style structure. If you are not familiar with this type of architecture, please check this tutorial:
This is the model:
def model_build(hp):
    model = Sequential()
    inputShape = (32, 32, 3)
    chanDim = -1

    # First CONV => RELU => BN => POOL block; the filter count is the
    # tunable hyperparameter "conv_1"
    model.add(Conv2D(
        hp.Int("conv_1", min_value=64, max_value=128, step=32),
        (3, 3), padding="same", input_shape=inputShape))
    model.add(Activation("relu"))
    model.add(BatchNormalization(axis=chanDim))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    # Second CONV => RELU => BN => POOL block with its own filter count
    model.add(Conv2D(
        hp.Int("conv_2", min_value=128, max_value=256, step=32),
        (3, 3), padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(axis=chanDim))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    # Fully connected head; both layer widths are tunable
    model.add(Flatten())
    model.add(Dense(hp.Int("dense_units1", min_value=256, max_value=768, step=256)))
    model.add(Dense(hp.Int("dense_units2", min_value=256, max_value=768, step=256)))
    model.add(Activation("relu"))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))

    # Softmax classifier for the 10 CIFAR-10 classes
    model.add(Dense(10))
    model.add(Activation("softmax"))

    # The learning rate is also tuned
    lr = hp.Choice("learning_rate", values=[1e-1, 1e-2, 1e-3, 1e-4])
    opt = Adam(learning_rate=lr)
    model.compile(optimizer=opt, loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
Please notice that "conv_1", "conv_2", "dense_units1", "dense_units2", and "learning_rate" are not fixed values; they are hyperparameter definitions. Keras Tuner will search over them and tell us the best values, and with those values we will finish developing the model.
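To make the idea concrete, here is a minimal sketch of how an hp.Int definition behaves on its own. The HyperParameters object below is created manually just for illustration; during the search, the tuner creates it and supplies it to model_build:

import keras_tuner as kt

# Stand-alone HyperParameters object, created only for this illustration
hp = kt.HyperParameters()

# "conv_1" can take the values 64, 96, or 128 (min 64, max 128, step 32).
# Outside of a search, hp.Int simply returns the default, which is the minimum.
filters = hp.Int("conv_1", min_value=64, max_value=128, step=32)
print(filters)  # 64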
I would also like to have a plot function to plot the results of the training:
def plot_h(H):
    plt.style.use("ggplot")
    plt.figure(figsize=(6, 5))
    plt.plot(H.history["loss"], label="train_loss")
    plt.plot(H.history["val_loss"], label="val_loss")
    plt.plot(H.history["accuracy"], label="train_acc")
    plt.plot(H.history["val_accuracy"], label="val_acc")
    plt.title("Training Loss and Accuracy")
    plt.xlabel("Epoch #")
    plt.ylabel("Loss/Accuracy")
    plt.legend()
We will use the well-known public CIFAR-10 dataset for this exercise. We load it from Keras with the load_data method, which already gives us separate training and test sets:
((train_x, train_y), (test_x, test_y)) = cifar10.load_data()
The labels train_y and test_y are integer class ids. If you want one-hot encoded labels (for use with categorical_crossentropy), you can transform them with LabelBinarizer from scikit-learn. Since our model uses sparse_categorical_crossentropy, we will pass the integer labels to the tuner and to fit() directly, but the transformation is shown here for reference:
lb = LabelBinarizer()
train_y_lb = lb.fit_transform(train_y)
test_y_lb = lb.transform(test_y)
The data is ready!
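As a quick illustration of the two label formats (the values in the comments are just examples):

print(train_y[:3].ravel())   # integer class ids, e.g. [6 9 9]
print(train_y_lb[:3])        # the same labels one-hot encoded across 10 columns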
Now we will use the tuner to find the best hyperparameters. From Keras Tuner, we will use the Hyperband tuner today; maybe we will work with some of the other tuners in future tutorials.
Hyperband takes the model_build function and an objective to optimize. We will use max_epochs=10 for this exercise.
from keras_tuner.tuners import Hyperband

tuner = Hyperband(
    model_build,
    objective='val_accuracy',
    max_epochs=10)
Then we call the tuner.search method to run the search:
tuner.search(
    x=train_x, y=train_y,
    validation_data=(test_x, test_y),
    batch_size=64,
    epochs=30)
Output:
Trial 30 Complete [00h 02m 24s]
val_accuracy: 0.6922000050544739
Best val_accuracy So Far: 0.7228000164031982
Total elapsed time: 00h 26m 19s
The hyperparameter search is done. Now we retrieve the best hyperparameters with the get_best_hyperparameters() method and print them out:
hp_best = tuner.get_best_hyperparameters(num_trials=1)[0]

print("conv_1 layer: {}".format(hp_best.get("conv_1")))
print("conv_2 layer: {}".format(hp_best.get("conv_2")))
print("dense layer1: {}".format(hp_best.get("dense_units1")))
print("dense layer2: {}".format(hp_best.get("dense_units2")))
print("learning rate: {}".format(hp_best.get("learning_rate")))
Output:
conv_1 layer: 96
conv_2 layer: 192
dense layer1: 768
dense layer2: 768
learning rate: 0.001
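If you prefer, you can also look at all of the chosen values at once through the values attribute of the HyperParameters object:

print(hp_best.values)   # a dict such as {'conv_1': 96, 'conv_2': 192, ...}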
We have all our hyperparameters. Now we rebuild the model with them and train it:
# tuner.hypermodel.build() compiles the model with the tuned learning rate
model = tuner.hypermodel.build(hp_best)

H = model.fit(x=train_x, y=train_y,
              validation_data=(test_x, test_y), batch_size=64,
              epochs=50, verbose=1)
Part of the output:
Epoch 22/50
782/782 [==============================] - 9s 12ms/step - loss: 0.1025 - accuracy: 0.9665 - val_loss: 1.6440 - val_accuracy: 0.7347
Epoch 23/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0981 - accuracy: 0.9670 - val_loss: 1.7443 - val_accuracy: 0.7118
Epoch 24/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0921 - accuracy: 0.9691 - val_loss: 1.4908 - val_accuracy: 0.7359
Epoch 25/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0868 - accuracy: 0.9710 - val_loss: 1.8901 - val_accuracy: 0.6545
Epoch 26/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0880 - accuracy: 0.9705 - val_loss: 1.8133 - val_accuracy: 0.7088
Epoch 27/50
782/782 [==============================] - 10s 13ms/step - loss: 0.0813 - accuracy: 0.9729 - val_loss: 2.6010 - val_accuracy: 0.6120
Epoch 28/50
782/782 [==============================] - 10s 12ms/step - loss: 0.0823 - accuracy: 0.9721 - val_loss: 1.9144 - val_accuracy: 0.6998
Epoch 29/50
782/782 [==============================] - 10s 12ms/step - loss: 0.0811 - accuracy: 0.9738 - val_loss: 2.2381 - val_accuracy: 0.6807
Epoch 30/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0728 - accuracy: 0.9762 - val_loss: 1.9062 - val_accuracy: 0.7155
Epoch 31/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0681 - accuracy: 0.9770 - val_loss: 1.8808 - val_accuracy: 0.7238
Epoch 32/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0665 - accuracy: 0.9782 - val_loss: 3.4956 - val_accuracy: 0.6274
Epoch 33/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0673 - accuracy: 0.9785 - val_loss: 1.9554 - val_accuracy: 0.7200
Epoch 34/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0611 - accuracy: 0.9805 - val_loss: 2.0881 - val_accuracy: 0.6907
Epoch 35/50
782/782 [==============================] - 10s 13ms/step - loss: 0.0616 - accuracy: 0.9800 - val_loss: 2.7374 - val_accuracy: 0.6635
Epoch 36/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0609 - accuracy: 0.9800 - val_loss: 1.9722 - val_accuracy: 0.7279
Epoch 37/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0627 - accuracy: 0.9807 - val_loss: 2.0683 - val_accuracy: 0.7128
Epoch 38/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0533 - accuracy: 0.9819 - val_loss: 2.2291 - val_accuracy: 0.7129
Epoch 39/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0555 - accuracy: 0.9827 - val_loss: 2.4901 - val_accuracy: 0.7018
Epoch 40/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0519 - accuracy: 0.9833 - val_loss: 2.1977 - val_accuracy: 0.7231
Epoch 41/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0505 - accuracy: 0.9833 - val_loss: 2.0045 - val_accuracy: 0.7239
Epoch 42/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0504 - accuracy: 0.9843 - val_loss: 3.1002 - val_accuracy: 0.6642
Epoch 43/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0511 - accuracy: 0.9840 - val_loss: 2.2534 - val_accuracy: 0.7169
Epoch 44/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0504 - accuracy: 0.9846 - val_loss: 2.2480 - val_accuracy: 0.7225
Epoch 45/50
782/782 [==============================] - 9s 11ms/step - loss: 0.0476 - accuracy: 0.9847 - val_loss: 2.0083 - val_accuracy: 0.7440
Epoch 46/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0455 - accuracy: 0.9853 - val_loss: 2.2283 - val_accuracy: 0.7169
Epoch 47/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0458 - accuracy: 0.9860 - val_loss: 3.6712 - val_accuracy: 0.6340
Epoch 48/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0478 - accuracy: 0.9854 - val_loss: 2.3555 - val_accuracy: 0.7247
Epoch 49/50
782/782 [==============================] - 10s 13ms/step - loss: 0.0446 - accuracy: 0.9864 - val_loss: 2.4462 - val_accuracy: 0.6954
Epoch 50/50
782/782 [==============================] - 9s 12ms/step - loss: 0.0423 - accuracy: 0.9864 - val_loss: 2.2708 - val_accuracy: 0.7243
Let’s check the predictions, and get precision, recall, and F1 score:
predictions = model.predict(x=test_x, batch_size=64)
print(classification_report(test_y.reshape(10000, ),
                            predictions.argmax(axis=1)))
Output:
precision recall f1-score support
0 0.81 0.70 0.75 1000
1 0.91 0.77 0.84 1000
2 0.58 0.69 0.63 1000
3 0.52 0.57 0.55 1000
4 0.68 0.70 0.69 1000
5 0.64 0.61 0.62 1000
6 0.77 0.78 0.78 1000
7 0.76 0.80 0.78 1000
8 0.84 0.80 0.82 1000
9 0.81 0.82 0.81 1000
accuracy 0.72 10000
macro avg 0.73 0.72 0.73 10000
weighted avg 0.73 0.72 0.73 10000
Finally, we got an accuracy of 72%.
If you want, you can try the hyperparameter search with different settings. For example, max_epochs in the Hyperband tuner was 10 in this exercise; you could use 20. Or, in the tuner.search() call I used 30 epochs; please feel free to try a different number and see whether the accuracy changes. A sketch of such a re-run is shown below.
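Here is a minimal sketch of such a re-run, assuming the same model_build function and data. The directory and project_name arguments are optional; the names used here are just hypothetical and only keep this run's results separate from the earlier one:

tuner2 = Hyperband(
    model_build,
    objective='val_accuracy',
    max_epochs=20,
    directory='tuner_runs',            # hypothetical folder for this run
    project_name='cifar10_mini_vgg')   # hypothetical project name

tuner2.search(
    x=train_x, y=train_y,
    validation_data=(test_x, test_y),
    batch_size=64,
    epochs=40)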
Let’s use the plot_h() function to plot the results:
plot_h(H)

As the plot shows, the training loss keeps dropping and the training accuracy climbs above 95%, while the validation accuracy plateaus around 72% and the validation loss rises, which is a sign of overfitting.
Conclusion
In this tutorial, I wanted to introduce Keras Tuner for hyperparameter tuning. I used only one tuner method here; I will make tutorials on the other tuners available in Keras Tuner in the future. Please experiment with it, and hopefully you will find it useful.