Using keras-tuner to tune hyperparameters of a TensorFlow model

In this article, I am going to show how to use the random search hyperparameter tuning method with Keras. I decided to use the keras-tuner project, which at the time of writing this article has not been officially released yet, so I have to install it directly from the GitHub repository.

# remove ! if you are not running this in a Jupyter Notebook
!git clone https://github.com/keras-team/keras-tuner.git
!pip install ./keras-tuner
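
After the installation, a quick import confirms that the package is available (the top-level package is called kerastuner, as the imports later in this article show; this check is just a convenience, not a required step).

# verify that the freshly installed package can be imported
import kerastuner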

As an example, I will use the Fashion-MNIST dataset, so the goal is to perform multiclass classification of images. First, I have to load the training and test datasets. Fashion-MNIST is available as one of the Keras built-in datasets, so the following code downloads everything I need.

import tensorflow as tf
from tensorflow import keras
import numpy as np

fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

The images have already been preprocessed, so the dataset contains a single gray-scale channel with pixel values in the range 0-255. I want to scale the values to the range between 0 and 1, so I divide them by 255.

train_images = train_images / 255.0
test_images = test_images / 255.0

I am going to reshape the dataset to add an explicit channel dimension, so I can use it as the input of the convolutional layer.

train_images = train_images.reshape(len(train_images), 28, 28, 1)
test_images = test_images.reshape(len(test_images), 28, 28, 1)
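
As a quick sanity check, I can print the shapes to confirm that the channel dimension is there. Fashion-MNIST contains 60,000 training images and 10,000 test images, so I expect the following output.

# should print (60000, 28, 28, 1) and (10000, 28, 28, 1)
print(train_images.shape, test_images.shape)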

Parameters

Keras-tuner needs a function that accepts a set of parameters and returns a compiled model, so I have to define such a function.
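
Before going through the parameter kinds, here is a minimal sketch of that contract: the function receives an hp object and uses its methods to declare the values that should be tuned. The single-dense-layer model below is a throwaway example just to show the shape of the function; the real model comes later in the article.

# minimal sketch of a model-building function for keras-tuner;
# the hp object provides methods (Range, Choice, ...) for declaring
# the hyperparameters the tuner should search over
def build_model_sketch(hp):
  model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28, 1)),
    keras.layers.Dense(
        units=hp.Range('units', min_value=32, max_value=128, step=32),
        activation='relu'),
    keras.layers.Dense(10, activation='softmax')
  ])
  model.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
  return model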

There are four kinds of parameters available: range, linear, choice, and fixed.

Range

The range returns integer values between the given minimum and maximum, incremented by the step parameter. In the example below, the possible values are 64, 80, 96, 112, and 128.

hp.Range('conv_1_filter', min_value=64, max_value=128, step=16)

Linear

The linear parameter is similar to the range but works with float numbers. In this case, the step is called resolution.

hp.Linear('learning_rate', min_value=0.01, max_value=0.1, resolution=0.01)

Choice

The choice parameter is much simpler. We give it a list of values, and it returns one of them.

hp.Choice('learning_rate', values=[1e-2, 1e-3])

Fixed

Finally, we can set a constant as the parameter value. It is useful when we want to let keras-tuner tune all parameters except one. The fixed parameter works only with the predefined models: Xception and ResNet.

hp.Fixed('learning_rate', value=1e-4)

How to define the model

Here is my function that builds a neural network using the parameters given by keras-tuner. Even though it is not necessary in this case, I will parameterize all layers and the learning rate, to show that it is possible.

def build_model(hp):
  # build a small CNN; every hyperparameter below is declared through
  # the hp object, so keras-tuner can pick its value in each trial
  model = keras.Sequential([
    keras.layers.Conv2D(
        # first convolution: 64-128 filters, in steps of 16
        filters=hp.Range('conv_1_filter', min_value=64, max_value=128, step=16),
        # kernel size: either 3x3 or 5x5
        kernel_size=hp.Choice('conv_1_kernel', values=[3, 5]),
        activation='relu',
        input_shape=(28, 28, 1)
    ),
    keras.layers.Conv2D(
        # second convolution: 32-64 filters, in steps of 16
        filters=hp.Range('conv_2_filter', min_value=32, max_value=64, step=16),
        kernel_size=hp.Choice('conv_2_kernel', values=[3, 5]),
        activation='relu'
    ),
    keras.layers.Flatten(),
    keras.layers.Dense(
        # fully connected layer: 32-128 units, in steps of 16
        units=hp.Range('dense_1_units', min_value=32, max_value=128, step=16),
        activation='relu'
    ),
    # output layer: one unit per Fashion-MNIST class
    keras.layers.Dense(10, activation='softmax')
  ])

  # the learning rate is tuned as well: either 0.01 or 0.001
  model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', values=[1e-2, 1e-3])),
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])

  return model

Configure the tuner

When the function is ready, I have to configure the tuner. I need to specify the objective, which is the metric used to compare models. In this case, I want to use the validation set accuracy.

The other important parameter is the number of trials. That parameter tells the tuner how many hyperparameter combinations it has to test.

I must also specify the project name and the output directory, which tell the tuner where it should store the data produced during the trials.

Note that I pass the function defined above as the first argument!

from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    directory='output',
    project_name='FashionMNIST')
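
Before running the search, I can ask the tuner to print the search space it is going to explore. The search_space_summary method lists every hyperparameter together with its possible values.

# print every tunable hyperparameter and its range of values
tuner.search_space_summary()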

Hyperparameter tuning

Now, I have a configured tuner, and it is time to run it. I need the training dataset and the number of epochs for every trial. I must also specify the validation dataset or the percentage of the training set that will be used for validation.

I call the search function, and when all trials have finished, I get the results of the tuning.

tuner.search(train_images, train_labels, epochs=2, validation_split=0.1)
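
When the search is over, I can also print a short report of the trials. The results_summary method shows the best trials together with the hyperparameter values they used and the scores they achieved.

# print the best trials, their hyperparameters, and their scores
tuner.results_summary()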

Using the model

When the search is done, I can get the best model and either start using it or continue training.

model = tuner.get_best_models(num_models=1)[0]
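
If I am interested only in the winning hyperparameter values, I can ask the tuner for them directly. The get_best_hyperparameters method returns a list of HyperParameters objects, and their values attribute holds a plain dictionary (the dictionary in the comment below is only an illustration).

# retrieve the hyperparameter values of the best trial
best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hp.values)  # e.g. {'conv_1_filter': 96, 'conv_1_kernel': 5, ...}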

In this example, I trained the model for only two epochs, so I will continue training it, starting from the third epoch.

model.fit(train_images, train_labels, epochs=10, validation_split=0.1, initial_epoch=2)
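
Finally, I want to know how the tuned model performs on data it has never seen, so I evaluate it on the test set that I loaded at the beginning.

# evaluate the tuned model on the held-out Fashion-MNIST test set
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)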


