Using a surrogate model to interpret a machine learning model

In my opinion, training a surrogate model is the easiest method of interpreting the behavior of an existing machine learning model.

To apply this method, we are going to need:

  • an existing machine learning model
  • input data that can be processed by the existing model (for example, the test dataset held out while training the model or a sample of real-world data from the production environment)

We don’t need to know anything about the internals of the existing model. It is just a black box: it has an input, and when we pass data to it, we get an output. That is all we need.

In the first step, I am going to pass the data into the black box model and get its predictions.

          =======================
          =                     =
data =>   =     black box       =  => prediction
          =       model         =
          =======================
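
Here is a minimal sketch of that first step in Python with scikit-learn. The synthetic data and the random forest are only hypothetical stand-ins for "the existing model"; in a real project, the black box is already trained and I only call its predict method.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical input data: 500 rows, 3 features (an assumption for this sketch)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 5 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

# Stand-in for the existing model; normally it would be loaded, not trained here
black_box = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# The only thing I need from the black box: its predictions for the input data
black_box_predictions = black_box.predict(X)
```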

Now, I have to decide what kind of model I want to train as the surrogate model. It should be a model that I know how to interpret and explain to people who have no machine learning knowledge, for example, linear regression or decision trees.

I am going to train the surrogate model, using the independent variables from the input data as features and the predictions from the black box (not the true labels) as the dependent variable.

independent variables       prediction
from input dataset          from black box model
              ||              ||
              ||              ||
              \/              \/
          =======================
          =                     =
          =     surrogate       =
          =       model         =
          =======================
                    ||
                    \/
                 surrogate's
                 prediction
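
The diagram above could look like this in code. It is only a sketch: the shallow decision tree is one possible choice of surrogate, and the data and the random forest black box are hypothetical placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data and black box, used only to make the sketch runnable
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 5 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)
black_box = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
black_box_predictions = black_box.predict(X)

# The surrogate learns to imitate the black box, so its target is the
# black box's prediction, not the original label y
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box_predictions)

surrogate_predictions = surrogate.predict(X)
```

I limit the tree depth on purpose: a deeper tree would imitate the black box more closely, but it would also be harder to explain.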

After that, I can calculate the prediction error of the surrogate model by comparing its predictions with the predictions of the black box (not with the true labels). The smaller the error I get, the better the surrogate model explains the black box.
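
One way to measure that error is sketched below, assuming a regression problem. I use the mean squared error and the R² score from scikit-learn; the data, the black box, and the linear surrogate are hypothetical examples. What counts as an acceptable error depends on the use case.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical data and black box, used only to make the sketch runnable
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 5 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)
black_box = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
black_box_predictions = black_box.predict(X)

# Linear regression surrogate trained on the black box's predictions
surrogate = LinearRegression().fit(X, black_box_predictions)
surrogate_predictions = surrogate.predict(X)

# The "ground truth" here is the black box's output, not the original label
mse = mean_squared_error(black_box_predictions, surrogate_predictions)
r2 = r2_score(black_box_predictions, surrogate_predictions)

print(f"MSE vs. black box: {mse:.3f}")
print(f"R^2 vs. black box: {r2:.3f}")
```

An R² close to 1 means the surrogate reproduces most of the variance in the black box's predictions. A low value means the surrogate's parameters should not be trusted as an explanation.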

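When I get a surrogate model which has an acceptable prediction error, I can look at its parameters (for example, the coefficients of a linear regression or the splits of a decision tree) to understand which features are important and, approximately, how the black box model works.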
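
As an illustration, this sketch inspects a decision tree surrogate with scikit-learn. The feature names and the data are made up for the example; in the synthetic data the first feature has by far the strongest influence, so I expect the surrogate to rank it first.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical data and black box, used only to make the sketch runnable
rng = np.random.default_rng(0)
feature_names = ["feature_a", "feature_b", "feature_c"]
X = rng.normal(size=(500, 3))
y = 5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=500)
black_box = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Shallow surrogate tree trained on the black box's predictions
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Which features does the surrogate rely on?
importances = surrogate.feature_importances_
for name, importance in zip(feature_names, importances):
    print(f"{name}: {importance:.3f}")

# Human-readable if/else rules that approximate the black box
print(export_text(surrogate, feature_names=feature_names))
```

The printed rules describe the surrogate, not the black box itself, so they are only as trustworthy as the surrogate's prediction error allows.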


Bartosz Mikulski * data scientist / software engineer * conference speaker * organizer of School of A.I. meetups in Poznań * co-founder of Software Craftsmanship Poznan & Poznan Scala User Group