# Using keras-tuner to tune hyperparameters of a TensorFlow model

In this article, I am going to show how to use the random search hyperparameter tuning method with Keras. I decided to use the keras-tuner project, which at the time of writing has not been officially released yet, so I have to install it directly from the GitHub repository.

```shell
# remove ! if you are not running it in a Jupyter Notebook
!git clone https://github.com/keras-team/keras-tuner.git
!pip install ./keras-tuner
```

As an example, I will use the Fashion-MNIST dataset, so the goal is to perform a multiclass classification of images. First, I have to load the training and test dataset. Fashion-MNIST is available as one of the Keras built-in datasets, so the following code downloads everything I need.

```python
import tensorflow as tf
from tensorflow import keras
import numpy as np

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
```

The images have already been preprocessed, so the dataset contains a single channel (gray-scale) of color values in the range 0-255. I want to scale the values to the range between 0 and 1, so I divide them by 255.

```python
train_images = train_images / 255.0
test_images = test_images / 255.0
```
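Dividing by 255 maps the smallest possible pixel value (0) to 0.0 and the largest (255) to 1.0, which a quick NumPy check confirms:

```python
import numpy as np

# A few representative 8-bit pixel values.
pixels = np.array([0, 64, 128, 255], dtype=np.float64)
scaled = pixels / 255.0

print(scaled.min(), scaled.max())  # 0.0 1.0
```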

I am going to reshape the dataset so it can be used as input to the convolutional layer.

```python
train_images = train_images.reshape(len(train_images), 28, 28, 1)
test_images = test_images.reshape(len(test_images), 28, 28, 1)
```
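Conv2D expects a 4-dimensional input of shape (batch, height, width, channels), so the reshape only adds a trailing channel axis. A minimal sketch of the shape change, using a fake batch of two gray-scale images:

```python
import numpy as np

# Two fake 28x28 gray-scale images, as loaded: (batch, height, width).
images = np.zeros((2, 28, 28))

# Add the channel dimension expected by Conv2D: (batch, height, width, channels).
images = images.reshape(len(images), 28, 28, 1)

print(images.shape)  # (2, 28, 28, 1)
```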

# Parameters

Keras-tuner needs a function that accepts a set of parameters and returns a compiled model, so I have to define such a function.

There are four kinds of parameters available: range, choice, linear, and fixed.


## Range

The range parameter returns integer values between the given minimum and maximum, incremented by the step parameter.

```python
hp.Range('conv_1_filter', min_value=64, max_value=128, step=16)
```
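Assuming the maximum is inclusive (as it is in later keras-tuner releases), this definition can draw the following filter counts:

```python
# Candidate values for hp.Range('conv_1_filter', min_value=64, max_value=128, step=16),
# assuming an inclusive maximum.
values = list(range(64, 128 + 1, 16))
print(values)  # [64, 80, 96, 112, 128]
```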

## Linear

The linear parameter is similar to the range but works with floating-point numbers. In this case, the step is called resolution.

```python
hp.Linear('learning_rate', min_value=0.01, max_value=0.1, resolution=0.1)
```

## Choice

The choice parameter is much simpler. We give it a list of values, and it returns one of them.

```python
hp.Choice('learning_rate', values=[1e-2, 1e-3])
```

## Fixed

Finally, we can set a constant as the parameter value. It is useful when we want to let keras-tuner tune all parameters except one. The fixed parameter works only with the predefined models: Xception and ResNet.

```python
hp.Fixed('learning_rate', value=1e-4)
```

# How to define the model

Here is my function that builds a neural network using the parameters given by keras-tuner. Even though it is not necessary in this case, I will parameterize all layers and the learning rate, to show that it is possible.

```python
def build_model(hp):
    model = keras.Sequential([
        keras.layers.Conv2D(
            filters=hp.Range('conv_1_filter', min_value=64, max_value=128, step=16),
            kernel_size=hp.Choice('conv_1_kernel', values=[3, 5]),
            activation='relu',
            input_shape=(28, 28, 1)
        ),
        keras.layers.Conv2D(
            filters=hp.Range('conv_2_filter', min_value=32, max_value=64, step=16),
            kernel_size=hp.Choice('conv_2_kernel', values=[3, 5]),
            activation='relu'
        ),
        keras.layers.Flatten(),
        keras.layers.Dense(
            units=hp.Range('dense_1_units', min_value=32, max_value=128, step=16),
            activation='relu'
        ),
        keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Choice('learning_rate', values=[1e-2, 1e-3])),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
    return model
```

# Configure the tuner

When the function is ready, I have to configure the tuner. We need to specify the objective, which is the metric used to compare models. In this case, I want to use validation set accuracy.

The other important parameter is the number of trials. It tells the tuner how many hyperparameter combinations it has to test.

I must also specify the project name and the output directory, which tell the tuner where to store the trial results and checkpoints.

Note that I passed the function defined above as the first parameter!

```python
from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=5,
    directory='output',
    project_name='FashionMNIST')
```

# Hyperparameter tuning

Now I have a configured tuner, so it is time to run it. I need the training dataset and the number of epochs for every trial. I must also specify either the validation dataset or the fraction of the training dataset that will be used for validation.

I call the search function, and eventually, I will get the results of the tuning.

```python
tuner.search(train_images, train_labels, epochs=2, validation_split=0.1)
```

# Using the model

When the search is done, I can get the best model and either start using it or continue training.

```python
model = tuner.get_best_models(num_models=1)[0]
```

In this example, I trained the model for only two epochs, so I will continue training it, starting from the third epoch.

```python
model.fit(train_images, train_labels, epochs=10, validation_split=0.1, initial_epoch=2)
```
