Test-driven development in Jupyter Notebook

In the blog post “5 reasons why jupyter notebooks suck”, Alexander Mueller complained that code written in notebooks is difficult to test.

Unfortunately, that is true, but I want to show you how to work around that issue and get something that resembles test-driven development in notebooks.

First of all, we must get used to the fact that notebooks do not support the testing libraries commonly used in Python. The only thing we can use is the testing capability built into Python itself, which means the “assert” keyword.

That’s it. The only thing we have is the ability to check whether an expression evaluates to True.

The second problem is the way Jupyter displays assertion errors. It looks like every other type of error. We must get used to that too.

There are dozens of books about test-driven development itself, so I am not going to focus on the method. Instead of that, I will show you the tools.

The “assert” keyword

How does “assert” work? An assert statement consists of two parts separated by a comma. The first part is the boolean expression that is supposed to be True. The second part, which is optional, is the error message displayed if the boolean expression evaluates to False. We can use the second part to give the test a name or describe what should happen.

Let’s look at an example. In this case, the assertion fails, and we get an error.

def multiplyByTwo(x):
    return x * 3

assert multiplyByTwo(2) == 4, "2 multiplied by 2 should be equal 4"
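A failed assertion raises an ordinary AssertionError, which is why Jupyter renders it like any other exception. The sketch below catches the error only to make that visible. In a notebook, you would normally let it propagate so that the cell fails.

```python
def multiplyByTwo(x):
    return x * 3  # deliberate bug, same as in the example above


try:
    assert multiplyByTwo(2) == 4, "2 multiplied by 2 should be equal 4"
except AssertionError as error:
    # The second part of the assert statement becomes the exception message.
    caught = error
    print(type(error).__name__ + ": " + str(error))
```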

Unfortunately, if the function call stays inside the assert expression, the only way to display the returned value in the error message is to call the same function twice.

def multiplyByTwo(x):
    return x * 3

assert multiplyByTwo(2) == 4, "2 multiplied by 2 should be equal 4, but the function returned: " + str(multiplyByTwo(2))

It is trivial in this case, but if the function does some heavy computation, it may take some time to get the error message. You can avoid this problem by assigning the output to a variable and using the variable in the assert clause.

def multiplyByTwo(x):
    return x * 3

result = multiplyByTwo(2)
assert result == 4, "2 multiplied by 2 should be equal 4, but the function returned: " + str(result)
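If a notebook contains many such checks, the pattern above could be wrapped in a small helper. The sketch below shows one possible shape. The name assert_equal is hypothetical and is not part of Python or Jupyter, and the function here is the corrected version so that the cell runs cleanly.

```python
def assert_equal(actual, expected, description):
    # Hypothetical helper: the caller computes `actual` once and passes it in,
    # so the value can appear in the message without a second function call.
    assert actual == expected, (
        description + ", but the function returned: " + repr(actual)
    )


def multiplyByTwo(x):
    return x * 2  # corrected version of the function


assert_equal(multiplyByTwo(2), 4, "2 multiplied by 2 should be equal 4")
```

One trade-off of this design is that the last frame of the traceback is inside the helper, so the description should carry enough context to identify the failing check.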

What happens if the test passes? That is kind of sad, but it does not display anything.

You won’t see a green progress bar or an encouraging message like: “All tests passed.” If you have done everything right, all you get is an empty output.
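If the silence bothers you, one option is to end the test cell with a print call. Because a failed assertion stops the cell, that line runs only when every assertion above it has passed. This is just a convention I am assuming here, not a Jupyter feature, and the second assertion is an extra check added for illustration.

```python
def multiplyByTwo(x):
    return x * 2


result = multiplyByTwo(2)
assert result == 4, "2 multiplied by 2 should be equal 4, but the function returned: " + str(result)
assert multiplyByTwo(0) == 0, "0 multiplied by 2 should be equal 0"

# Reached only if no assertion above raised an AssertionError.
summary = "All tests passed."
print(summary)
```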

Bartosz Mikulski

  • Data/MLOps engineer by day
  • DevRel/copywriter by night
  • Python and data engineering trainer
  • Conference speaker
  • Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
  • Twitter: @mikulskibartosz