Smoothing time series in Pandas

To smooth time series data in Pandas, we can use the exponentially weighted window functions to calculate an exponentially weighted moving average.

First, I am going to load a dataset which contains Bitcoin prices recorded every minute.

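import pandas as pd

# Load the one-minute Bitcoin prices; 'Timestamp' holds Unix time in seconds
data = pd.read_csv('../input/bitstampUSD_1-min_data_2012-01-01_to_2019-03-13.csv')
data['date'] = pd.to_datetime(data['Timestamp'], unit="s")

input_data = data[["date", "Close"]]

# Keep only the 2019 rows and index them by date.
# set_index returns a new DataFrame, which avoids modifying a slice in place.
subset = input_data[input_data["date"] >= "2019-01-01"].set_index('date')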
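
The loading steps can also be tried without the CSV file. The sketch below uses a tiny hand-made DataFrame; the three rows are made-up values that only mimic the 'Timestamp' and 'Close' columns, so treat it as an illustration of the date conversion and filtering, not as real data.

```python
import pandas as pd

# Hypothetical rows mimicking the CSV columns: Unix seconds and a closing price
data = pd.DataFrame({
    "Timestamp": [1546300740, 1546300800, 1546300860],
    "Close": [3700.0, 3701.5, 3699.0],
})

# Convert Unix seconds to timestamps
data["date"] = pd.to_datetime(data["Timestamp"], unit="s")

# Keep rows from 2019 onwards and use the date as the index
subset = data.loc[data["date"] >= "2019-01-01", ["date", "Close"]].set_index("date")
print(subset)
```

The first made-up row falls on the last minute of 2018, so the filter drops it and two rows remain.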

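I want to plot a smoothed version of the prices, so I use a span of 3600 observations. Because the prices are recorded every minute, that span covers 60 hours (2.5 days); a one-day span would be 1440. The function does not reduce the number of rows, it returns one smoothed value for every input value: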

subset['Close'].ewm(span = 3600).mean()
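
To see what the span parameter does, here is a minimal sketch on a synthetic series (the values and the small span of 3 are assumptions chosen only to keep the example short). Pandas converts the span to a smoothing factor alpha = 2 / (span + 1), and the default adjusted average is a weighted mean with weights (1 - alpha)**i for the value i steps back. The helper manual_ewm is a hypothetical function written for this illustration.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute series standing in for the 'Close' column (assumption)
index = pd.date_range("2019-01-01", periods=10, freq="min")
prices = pd.Series(np.arange(10, dtype=float), index=index)

span = 3
alpha = 2 / (span + 1)  # smoothing factor implied by the span

smoothed = prices.ewm(span=span).mean()

def manual_ewm(values, alpha):
    """Recompute the adjusted exponentially weighted mean by hand."""
    out = []
    for t in range(len(values)):
        weights = (1 - alpha) ** np.arange(t + 1)  # weights for lag 0, 1, ..., t
        window = values[t::-1]                     # values from newest to oldest
        out.append(np.sum(weights * window) / np.sum(weights))
    return np.array(out)

manual = manual_ewm(prices.to_numpy(), alpha)

print(len(prices), len(smoothed))                 # same number of rows
print(np.allclose(smoothed.to_numpy(), manual))   # the two computations agree
```

A larger span gives a smaller alpha, so older prices keep more weight and the resulting line is smoother.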

By default, the adjusted version of the weighted average function (adjust=True) is used: every value is divided by the sum of the weights applied so far, so the smoothed series starts at the first observed price, not at 0.
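
The difference between the two modes is easiest to see on a few toy numbers. This is a small sketch with made-up values and a span of 3 (so alpha is 0.5); it compares the default adjusted average with the recursive form that pandas uses when adjust=False.

```python
import pandas as pd

prices = pd.Series([10.0, 20.0, 30.0, 40.0])  # toy values, not real prices

# Default: weighted mean normalized by the sum of the weights
adjusted = prices.ewm(span=3).mean()

# Recursive form: y[t] = (1 - alpha) * y[t-1] + alpha * x[t], with y[0] = x[0]
recursive = prices.ewm(span=3, adjust=False).mean()

print(pd.DataFrame({"adjusted": adjusted, "recursive": recursive}))
```

Both series start at the first observation. They differ in the following rows and move closer together as more data points are included.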

Finally, I can plot the original data and the smoothed time series:

import matplotlib.pyplot as plt

# Original prices as a dashed red line, smoothed prices as a solid blue line
subset['Close'].plot(style='r--', label='Bitcoin prices')
subset['Close'].ewm(span=3600).mean().plot(style='b', label='Exponential moving average')

plt.legend()
plt.title("Bitcoin prices")
plt.xlabel('Date')
plt.ylabel('Price (USD)')
plt.show()


Bartosz Mikulski
Bartosz Mikulski * data scientist / software/data engineer * conference speaker * organizer of School of A.I. meetups in Poznań * co-founder of Software Craftsmanship Poznan & Poznan Scala User Group