Using keras-tuner to tune hyperparameters of a TensorFlow model

In this article, I show how to use the random search hyperparameter tuning method with Keras. I use the keras-tuner project, which at the time of writing has not been officially released yet, so I have to install it directly from the GitHub repository.

# remove ! if you are not running it in a Jupyter Notebook
!git clone https://github.com/keras-team/keras-tuner.git
!pip install ./keras-tuner

As an example, I will use the Fashion-MNIST dataset, so the goal is to perform a multiclass classification of images. First, I have to load the training and test datasets. Fashion-MNIST is available as one of the Keras built-in datasets, so the following code downloads everything I need.

import tensorflow as tf
from tensorflow import keras
import numpy as np

fashion_mnist = keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

The images have already been preprocessed: each one has a single gray-scale channel with values in the range 0-255. I want to scale the values to the range between 0 and 1, so I divide them by 255.

train_images = train_images / 255.0
test_images = test_images / 255.0

I am going to reshape the dataset by adding a channel dimension of size 1, because the convolutional layer expects input of the shape (height, width, channels).

train_images = train_images.reshape(len(train_images), 28, 28, 1)
test_images = test_images.reshape(len(test_images), 28, 28, 1)
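
To check what the two preprocessing steps do, here is a small sketch that runs them on a synthetic batch instead of the real dataset. The array `fake_images` and its size are made up for this illustration; only the 28x28 shape and the 0-255 value range come from Fashion-MNIST.

```python
import numpy as np

# Hypothetical stand-in for a batch of Fashion-MNIST images:
# 4 gray-scale images, 28x28 pixels, integer values in 0-255.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Scaling: dividing by a float converts the integers to floats in [0, 1].
scaled = fake_images / 255.0

# Reshaping: append a channel axis of size 1 for the convolutional layer.
reshaped = scaled.reshape(len(scaled), 28, 28, 1)

print(scaled.dtype, reshaped.shape)
```

The reshape does not change any values. It only tells the layer that every pixel has exactly one channel.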

Parameters

Keras-tuner needs a function that accepts a set of hyperparameters and returns a compiled model, so I have to define such a function.

There are four kinds of parameters available: range, choice, linear, and fixed.

Range

The range parameter returns integer values between the given minimum and maximum. The values are incremented by the step parameter.

hp.Range('conv_1_filter', min_value=64, max_value=128, step=16)

Linear

The linear parameter is similar to the range but works with float numbers. In this case, the step is called resolution.

hp.Linear('learning_rate', min_value=0.01, max_value=0.1, resolution=0.1)

Choice

The choice parameter is much simpler. We give it a list of values, and it returns one of them.

hp.Choice('learning_rate', values=[1e-2, 1e-3])

Fixed

Finally, we can set a constant as the parameter value. It is useful when we want to let keras-tuner tune all parameters except one. The fixed parameter works only with the predefined models: Xception and ResNet.

hp.Fixed('learning_rate', value=1e-4)
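
To make the four kinds of parameters more concrete, here is a toy sketch in plain Python of how a random search could draw one value from each of them. The functions `sample_range`, `sample_linear`, `sample_choice`, and `fixed` are hypothetical helpers written for this illustration. They are not part of keras-tuner and do not claim to mirror its internals. I also assume that the maximum value is included in the range, and I picked a resolution of 0.01 for the linear example.

```python
import random

def sample_range(rng, min_value, max_value, step):
    # Integers from min_value to max_value (assumed inclusive), spaced by step.
    return rng.choice(list(range(min_value, max_value + 1, step)))

def sample_linear(rng, min_value, max_value, resolution):
    # Floats on a grid of width `resolution` between min_value and max_value.
    n_steps = int(round((max_value - min_value) / resolution))
    return min_value + rng.randint(0, n_steps) * resolution

def sample_choice(rng, values):
    # One element of the given list, picked uniformly at random.
    return rng.choice(values)

def fixed(value):
    # A constant: there is nothing to sample.
    return value

rng = random.Random(42)
sample = {
    'range': sample_range(rng, min_value=64, max_value=128, step=16),
    'linear': sample_linear(rng, min_value=0.01, max_value=0.1, resolution=0.01),
    'choice': sample_choice(rng, values=[1e-2, 1e-3]),
    'fixed': fixed(1e-4),
}
print(sample)
```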

How to define the model

Here is my function that builds a neural network using the parameters given by keras-tuner. Even though it is not necessary in this case, I will parameterize all layers and the learning rate, to show that it is possible.

def build_model(hp):  
  model = keras.Sequential([
    keras.layers.Conv2D(
        filters=hp.Range('conv_1_filter', min_value=64, max_value=128, step=16),
        kernel_size=hp.Choice('conv_1_kernel', values = [3,5]),
        activation='relu',
        input_shape=(28,28,1)
    ),
    keras.layers.Conv2D(
        filters=hp.Range('conv_2_filter', min_value=32, max_value=64, step=16),
        kernel_size=hp.Choice('conv_2_kernel', values = [3,5]),
        activation='relu'
    ),
    keras.layers.Flatten(),
    keras.layers.Dense(
        units=hp.Range('dense_1_units', min_value=32, max_value=128, step=16),
        activation='relu'
    ),
    keras.layers.Dense(10, activation='softmax')
  ])
  
  model.compile(optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', values=[1e-2, 1e-3])),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
  
  return model
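
It helps to know how large the search space defined by this function is. The sketch below counts the combinations in plain Python. It assumes that the maximum value of every range is included, which I have not verified for the pre-release version of keras-tuner.

```python
from math import prod

# Candidate values per hyperparameter, assuming max_value is inclusive.
search_space = {
    'conv_1_filter': list(range(64, 128 + 1, 16)),
    'conv_1_kernel': [3, 5],
    'conv_2_filter': list(range(32, 64 + 1, 16)),
    'conv_2_kernel': [3, 5],
    'dense_1_units': list(range(32, 128 + 1, 16)),
    'learning_rate': [1e-2, 1e-3],
}

# The size of the space is the product of the number of candidates.
n_combinations = prod(len(values) for values in search_space.values())
print(n_combinations)
```

Under that assumption, there are 840 combinations, so a random search with only a few trials visits a small fraction of them.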

Configure the tuner

When the function is ready, I have to configure the tuner. We need to specify the objective, which is the metric used to compare models. In this case, I want to use validation set accuracy.

The other important parameter is the number of trials. That parameter tells the tuner how many hyperparameter combinations it has to test.

I must also specify the project name and the output directory. They tell the tuner where it should store the debugging data.

Note that I pass the function defined above as the first argument!

from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    directory='output',
    project_name='FashionMNIST')
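
The roles of the objective and the number of trials are easier to see in a toy version of random search. The sketch below is plain Python and only illustrates the general idea. It is not the keras-tuner implementation. The names `random_search`, `sample_config`, and `toy_score` are hypothetical, and `toy_score` is a made-up formula that stands in for "train the model and read the validation accuracy".

```python
import random

def random_search(sample_config, score, max_trials, seed=0):
    """Try max_trials random configurations and return the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float('-inf')
    for _ in range(max_trials):
        config = sample_config(rng)
        value = score(config)
        if value > best_score:  # the objective is maximized, like accuracy
            best_config, best_score = config, value
    return best_config, best_score

def sample_config(rng):
    # A reduced search space with two of the hyperparameters used above.
    return {
        'dense_1_units': rng.choice(range(32, 129, 16)),
        'learning_rate': rng.choice([1e-2, 1e-3]),
    }

def toy_score(config):
    # Made-up objective; a real tuner would train and validate a model here.
    return -abs(config['dense_1_units'] - 96) - 1000 * config['learning_rate']

best_config, best_score = random_search(sample_config, toy_score, max_trials=5)
print(best_config, best_score)
```

Every trial is independent of the previous ones, which is why the number of trials directly controls how much of the search space gets explored.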

Hyperparameter tuning

Now, I have a configured tuner. It is time to run it. I need the training dataset and the number of epochs in every trial. I must also specify the validation dataset or the percentage of the training dataset that will be used for validation.

I call the search function, and eventually, I will get the results of the tuning.

tuner.search(train_images, train_labels, epochs=2, validation_split=0.1)
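
When a validation split is used, Keras takes the validation samples from the end of the arrays passed to `fit`, before shuffling. Assuming that the tuner forwards the arguments of `search` to `fit` unchanged, the arithmetic for the 60000 training images looks roughly like this. The function `split_for_validation` is a hypothetical helper, not the Keras code.

```python
def split_for_validation(n_samples, validation_split):
    # Index at which the training part ends and the validation part begins.
    split_at = int(n_samples * (1.0 - validation_split))
    return split_at, n_samples - split_at

n_train, n_val = split_for_validation(60000, 0.1)
print(n_train, n_val)
```

So every trial trains on the first 54000 images and computes the validation accuracy on the last 6000.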

Using the model

When the search is done, I can get the best model and either start using it or continue training.

model = tuner.get_best_models(num_models=1)[0]

In this example, I trained the model for only two epochs, so I will continue training it, starting from the third epoch.

model.fit(train_images, train_labels, epochs=10, validation_split=0.1, initial_epoch=2)
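
Note that Keras treats the `epochs` argument as the index of the final epoch, not as the number of additional epochs. A small sketch of that arithmetic, with a hypothetical helper `epochs_to_run`:

```python
def epochs_to_run(epochs, initial_epoch=0):
    # Training covers the zero-based epochs initial_epoch, ..., epochs - 1.
    return list(range(initial_epoch, epochs))

remaining = epochs_to_run(epochs=10, initial_epoch=2)
print(len(remaining))
```

With `epochs=10` and `initial_epoch=2`, the call above trains for 8 more epochs, so the model ends up with 10 epochs in total.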

Bartosz Mikulski
Bartosz Mikulski * data scientist / software engineer * conference speaker * organizer of School of A.I. meetups in Poznań * co-founder of Software Craftsmanship Poznan & Poznan Scala User Group