Smoothing time series in Pandas

To make time series data more smooth in Pandas, we can use the exponentially weighted window functions and calculate the exponentially weighted average.

First, I am going to load a dataset which contains Bitcoin prices recorded every minute.

1
2
3
4
5
6
7
data = pd.read_csv('../input/bitstampUSD_1-min_data_2012-01-01_to_2019-03-13.csv')
data['date'] = pd.to_datetime(data['Timestamp'], unit="s")

input_data = data[["date", "Close"]]

subset = input_data[input_data["date"] >= "2019-01-01"]
subset.set_index('date', inplace=True)

I want to plot their daily weighted average, so I must compress 3600 values into one using this function:

1
subset['Close'].ewm(span = 3600).mean()

We see that by default the adjusted version of the weighted average function is used, so the first element of the time series is not 0.



Finally, I can plot the original data and both the smoothed time series:

1
2
3
4
5
6
7
subset['Close'].plot(style = 'r--', label = 'Bitcoin prices')
subset['Close'].ewm(span = 3600).mean().plot(style = 'b', label = ' Exponential moving average')

plt.legend()
plt.title("Bitcoin prices")
plt.xlabel('Date')
plt.ylabel('Price (USD)')

Remember to share on social media!
If you like this text, please share it on Facebook/Twitter/LinkedIn/Reddit or other social media.

If you watch programming live streams, check out my YouTube channel.
You can also follow me on Twitter: @mikulskibartosz

If you want to hire me, send me a message on LinkedIn or Twitter.


Bartosz Mikulski
Bartosz Mikulski * big data engineer * conference speaker * co-founder of Software Craftsmanship Poznan & Poznan Scala User Group