How to set the global random_state in Scikit Learn
This information should be in the first paragraph of the Scikit Learn manual, but it is hidden somewhere in the FAQ, so let’s write about it here.
Scikit Learn does not have its own global random state; it uses the NumPy global random state instead. If you want reproducible results in a Jupyter Notebook (you should want that ;) ), set the seed at the beginning of your notebook:
np.random.seed(31415)
How can we check if it works? Run this code:
import numpy as np
from scipy.stats import norm  # norm.rvs draws from the NumPy global random state by default

print('Without seed')
print(norm.rvs(10, size = 4))
print(norm.rvs(10, size = 4))
print('With the same seed')
np.random.seed(31415)
print(norm.rvs(10, size = 4))
np.random.seed(31415) # reset the random seed back to 31415
print(norm.rvs(10, size = 4))
print('Without seed')
np.random.seed(None) # reseed from an unpredictable source
print(norm.rvs(10, size = 4))
print(norm.rvs(10, size = 4))
In my case the output was:
Without seed
[11.87381912 10.67665352 10.93843519 9.68574986]
[10.16669138 9.41330164 9.64055638 8.49694282]
With the same seed
[11.36242188 11.13410818 12.36307449 9.74043318]
[11.36242188 11.13410818 12.36307449 9.74043318]
Without seed
[ 8.79608103 9.40920579 11.23146236 10.18055655]
[11.5560791 9.77978961 11.9580387 11.39481905]
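The check above uses scipy, so here is a minimal sketch of the same experiment with a Scikit Learn function. It assumes that `sklearn.utils.shuffle` falls back to the NumPy global random state when no `random_state` argument is passed. The `data` array is just a made-up example.

```python
import numpy as np
from sklearn.utils import shuffle

data = np.arange(10)  # hypothetical example data

np.random.seed(31415)
first = shuffle(data)   # no random_state given, so the global NumPy state is used

np.random.seed(31415)   # reset the random seed back to 31415
second = shuffle(data)

# Both shuffles start from the same global state, so the orders should match
print(first)
print(second)
print(np.array_equal(first, second))
```

If the assumption holds, both printed arrays are identical and the last line prints True. Remove the second `np.random.seed` call and the two orders will most likely differ.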
Bartosz Mikulski
- MLOps engineer by day
- AI and data engineering consultant by night
- Python and data engineering trainer
- Conference speaker
- Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
- Twitter: @mikulskibartosz
- Mastodon: @mikulskibartosz@mathstodon.xyz