How to assign rows to ranked groups in AWS Athena

A ranked group is a method of grouping rows in the following way:

  • First, we have to order the rows by a column.
  • After that, we determine how many rows should be in every group. All of the groups (except the last one) must have the same number of rows.
  • In the end, we traverse the ordered rows and assign them to groups one by one.

For example, let’s assume that I have numbers 3, 5, 1, 8, 2, 4, 7, 9, 0, and I want to assign them to three groups.

In the first step, I have to order them 0, 1, 2, 3, 4, 5, 7, 8, 9. I calculate the size of the group. I have nine numbers, and I want three groups, so every group contains three numbers. Finally, I assign the numbers to groups:

  • Group I: 0, 1, 2
  • Group II, 3, 4, 5
  • Group III: 7, 8, 9

How do we do that in Athena? Fortunately, there is a built-in function NTILE:

SELECT t.*,,
       NTILE(3) OVER (ORDER BY col)
FROM table t;
Older post

How to define an AWS Athena view using Airflow

How to use the AWSAthenaOperator

Newer post

How to generate a sequence of dates in Redshift

How to use the generate_series function to generate a sequence of dates