How to sort a Pandas DataFrame by month name

The alphabetical order makes no sense when we sort a Pandas DataFrame by month names. How can we sort the values by months while keeping the proper order and still having month names in the column?

The simplest option may be to add a column with the month position in the calendar and sort by it. However, to me, it looks like an ugly hack. There must be a better way.

First, let’s create a Pandas DataFrame and mix the order of months:

1
2
3
4
5
6
7
8
import pandas as pd
df = pd.DataFrame({
    'date': ['2022-08-01', '2022-09-01', '2022-10-01', '2022-11-01', '2022-12-01'],
    'value': [123, 456, 789, 345, 678]
})

df['month'] = pd.to_datetime(df['date']).apply(lambda x: x.month_name())
df = df.sort_values('value')

Now, the df variable contains out-of-order variables:

1
2
3
4
5
6
7
| date       |   value | month     |
|:-----------|--------:|:----------|
| 2022-08-01 |     123 | August    |
| 2022-11-01 |     345 | November  |
| 2022-09-01 |     456 | September |
| 2022-12-01 |     678 | December  |
| 2022-10-01 |     789 | October   |

Let’s fix it.

The wrong way - putting months in the alphabetical order

To verify that we have a problem sorting the months, let’s try using the sort_values method on the month column. It will sort them alphabetically by names. That’s probably the worst way to display time-based data.

1
df.sort_values('month')

The result is a disaster:

1
2
3
4
5
6
7
| date       |   value | month     |
|:-----------|--------:|:----------|
| 2022-08-01 |     123 | August    |
| 2022-12-01 |     678 | December  |
| 2022-11-01 |     345 | November  |
| 2022-10-01 |     789 | October   |
| 2022-09-01 |     456 | September |

Sorting months by names in Pandas

To sort the months correctly, we must define the order first. We could create a list of months in the correct order by hand, but using the date_range function seems easier:

1
dates_in_order = pd.date_range(start='2022-01-01', end='2022-12-01', freq='MS')

As a result, we have a DataFrameIndex with the first days of every month in 2022:

1
2
3
4
DatetimeIndex(['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01',
               '2022-05-01', '2022-06-01', '2022-07-01', '2022-08-01',
               '2022-09-01', '2022-10-01', '2022-11-01', '2022-12-01'],
              dtype='datetime64[ns]', freq='MS')

Now, we can extract the month name from each date:

1
months_in_order = dates_in_order.map(lambda x: x.month_name()).to_list()

We have created a list of months:

1
2
3
4
5
6
7
8
9
10
11
12
['January',
 'February',
 'March',
 'April',
 'May',
 'June',
 'July',
 'August',
 'September',
 'October',
 'November',
 'December']

Now, let’s use the list of month names to define the month column as an ordered categorical variable:

1
2
3
4
5
df.month = pd.Categorical(
    df.month,
    categories=months_in_order,
    ordered=True
)

Our DataFrame is still not sorted correctly, but this time when we use the sort_values function, we will get the expected result:

1
df.sort_values('month')
1
2
3
4
5
6
7
| date       |   value | month     |
|:-----------|--------:|:----------|
| 2022-08-01 |     123 | August    |
| 2022-09-01 |     456 | September |
| 2022-10-01 |     789 | October   |
| 2022-11-01 |     345 | November  |
| 2022-12-01 |     678 | December  |

Did you enjoy reading this article?
Would you like to learn more about software craft in data engineering and MLOps?

Subscribe to the newsletter or add this blog to your RSS reader (does anyone still use them?) to get a notification when I publish a new essay!

Newsletter

Do you enjoy reading my articles?
Subscribe to the newsletter if you don't want to miss the new content, business offers, and free training materials.

Bartosz Mikulski

Bartosz Mikulski

  • Data/MLOps engineer by day
  • DevRel/copywriter by night
  • Python and data engineering trainer
  • Conference speaker
  • Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
  • Twitter: @mikulskibartosz
Newsletter

Do you enjoy reading my articles?
Subscribe to the newsletter if you don't want to miss the new content, business offers, and free training materials.