Smoothing time series in Python using the Savitzky–Golay filter

In this article, I will show you how to use the Savitzky–Golay filter in Python and explain how it works. To understand the Savitzky–Golay filter, you should be familiar with the moving average and linear regression.

The Savitzky-Golay filter has two parameters: the window size and the degree of the polynomial.

The window size parameter specifies how many data points are used to fit the polynomial regression function. The second parameter specifies the degree of the fitted polynomial, which must be smaller than the window size (if we choose 1 as the polynomial degree, we end up using a linear regression function).
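To get a feel for the two parameters, it may help to look at the weights the filter assigns to the points inside one window. The sketch below uses `scipy.signal.savgol_coeffs` with a window size of 5; the variable names are only illustrative. With a first-degree polynomial all points get the same weight, so in the middle of the series the filter behaves like a centered moving average.

```python
import numpy as np
from scipy.signal import savgol_coeffs

# Weights applied to the five points of a window.
linear = savgol_coeffs(5, 1)      # window size 5, polynomial degree 1
quadratic = savgol_coeffs(5, 2)   # window size 5, polynomial degree 2

print(linear)     # five equal weights of 0.2, i.e. a plain moving average
print(quadratic)  # symmetric weights, the largest one in the middle

# In both cases the weights sum to 1.
print(np.isclose(linear.sum(), 1.0), np.isclose(quadratic.sum(), 1.0))
```

With the second-degree polynomial, the central point counts the most and the outermost points get small negative weights, which is why the smoothed line follows the original data more closely.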

In every window, a new polynomial is fitted, which gives us the effect of smoothing the input dataset.

Take a look at the following animation (source: Wikipedia, author: Cdang, licence: CC BY-SA 3.0):

In every step, the window moves and a different part of the original dataset is used. Then, the local polynomial function is fitted to the data in the window, and a new data point is calculated using the polynomial function. After that, the window moves to the next part of the dataset, and the process repeats.
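The moving-window procedure can be sketched by hand with `numpy.polyfit`. This is only an illustration under a few assumptions: the helper `sliding_polyfit` is a hypothetical name, it expects an odd window size, it leaves the points at both edges unchanged, and the input is synthetic noisy data rather than real prices. Away from the edges, its output should match `savgol_filter`.

```python
import numpy as np
from scipy.signal import savgol_filter

def sliding_polyfit(values, window_length, polyorder):
    """Fit a polynomial in every full window and keep its value at the window centre."""
    values = np.asarray(values, dtype=float)
    half = window_length // 2                 # assumes an odd window_length
    offsets = np.arange(-half, half + 1)      # x-coordinates relative to the centre
    smoothed = values.copy()                  # edge points stay unchanged in this sketch
    for centre in range(half, len(values) - half):
        window = values[centre - half:centre + half + 1]
        coefficients = np.polyfit(offsets, window, polyorder)
        smoothed[centre] = np.polyval(coefficients, 0)  # value of the fit at the centre
    return smoothed

# Synthetic noisy signal, used only for the comparison.
rng = np.random.default_rng(0)
noisy = np.sin(np.linspace(0, 3, 30)) + rng.normal(scale=0.1, size=30)

manual = sliding_polyfit(noisy, window_length=5, polyorder=2)
reference = savgol_filter(noisy, window_length=5, polyorder=2)

# Compare only the points that have a full window around them.
print(np.allclose(manual[2:-2], reference[2:-2]))
```

The first and last two points are excluded from the comparison because `savgol_filter` treats the edges separately (by default it fits a polynomial to the outermost window), while the sketch simply copies the original values there.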



Python

Here is a dataset of Bitcoin prices during the days between 2019-07-19 and 2019-08-17.

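import matplotlib.pyplot as plt

# bitcoin holds the daily BTC prices (assumed here to be a pandas DataFrame or Series)
bitcoin.plot()
plt.title('Bitcoin price: 2019-07-19 - 2019-08-17')
plt.xlabel('Day')
plt.ylabel('BTC price in USD')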

I’m going to smooth the data in 5-day windows, first using a second-degree polynomial and then a first-degree polynomial.

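from scipy.signal import savgol_filter

# btc is the 1-D sequence of prices to smooth (assumed to be the price values stored in bitcoin)
# Second-degree polynomial fitted in every 5-day window
smoothed_2dg = savgol_filter(btc, window_length=5, polyorder=2)
smoothed_2dg

# First-degree polynomial (linear regression) fitted in every 5-day window
smoothed_1dg = savgol_filter(btc, window_length=5, polyorder=1)
smoothed_1dg

# Store both results next to the original prices
bitcoin['smoothed_2dg'] = smoothed_2dg
bitcoin['smoothed_1dg'] = smoothed_1dg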

When we plot the result, we see the original data and the two smoothed time series.

bitcoin.plot()
plt.title('Bitcoin price: 2019-07-19 - 2019-08-17')
plt.xlabel('Day')
plt.ylabel('BTC price in USD')
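
The snippets in this section rely on a dataset that is already loaded. For readers who want something they can run from scratch, here is a minimal self-contained sketch of the same steps. It makes several assumptions: the data is a synthetic random walk generated with NumPy (not real Bitcoin prices), and the names `prices` and `price` are made up for the example.

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter

# Synthetic stand-in for the price data: 30 daily values of a random walk.
rng = np.random.default_rng(42)
days = pd.date_range('2019-07-19', periods=30, freq='D')
prices = pd.DataFrame(
    {'price': 10000 + rng.normal(scale=200, size=30).cumsum()},
    index=days,
)

# Smooth in 5-day windows with a first- and a second-degree polynomial.
for degree in (1, 2):
    prices[f'smoothed_{degree}dg'] = savgol_filter(
        prices['price'].to_numpy(), window_length=5, polyorder=degree
    )

print(prices.head())
```

Calling `prices.plot()` afterwards draws the original series and both smoothed versions on one chart, in the same way as the plot above.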

Bartosz Mikulski

  • Data/MLOps engineer by day
  • DevRel/copywriter by night
  • Python and data engineering trainer
  • Conference speaker
  • Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
  • Twitter: @mikulskibartosz