How to assign rows to ranked groups in AWS Athena
Assigning rows to ranked groups means splitting them into ordered groups in the following way:
- First, we have to order the rows by a column.
- After that, we determine how many rows should be in every group. The groups should have the same number of rows; when the rows cannot be divided evenly, the group sizes differ by at most one row.
- In the end, we traverse the ordered rows and assign them to groups one by one.
For example, let’s assume that I have the numbers 3, 5, 1, 8, 2, 4, 7, 9, 0, and I want to assign them to three groups.
In the first step, I order them: 0, 1, 2, 3, 4, 5, 7, 8, 9. Next, I calculate the size of the group. I have nine numbers, and I want three groups, so every group contains three numbers. Finally, I assign the numbers to groups:
- Group I: 0, 1, 2
- Group II: 3, 4, 5
- Group III: 7, 8, 9
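To make the three steps concrete, here is a minimal Python sketch of the manual procedure. The function name `assign_ranked_groups` is hypothetical, and the handling of leftover rows (giving them to the earliest groups) is my assumption for counts that do not divide evenly; the example above does not need it.

```python
def assign_ranked_groups(values, group_count):
    """Return {group number: [values]}, with groups numbered from 1."""
    if group_count < 1:
        raise ValueError("group_count must be at least 1")

    # Step 1: order the rows.
    ordered = sorted(values)

    # Step 2: work out the group sizes. Leftover rows (if any) go to the
    # earliest groups; this is an assumption of the sketch.
    base_size, extra = divmod(len(ordered), group_count)

    # Step 3: walk through the ordered rows and fill the groups one by one.
    groups = {}
    start = 0
    for group_number in range(1, group_count + 1):
        size = base_size + (1 if group_number <= extra else 0)
        groups[group_number] = ordered[start:start + size]
        start += size
    return groups


example_groups = assign_ranked_groups([3, 5, 1, 8, 2, 4, 7, 9, 0], 3)
print(example_groups)
# → {1: [0, 1, 2], 2: [3, 4, 5], 3: [7, 8, 9]}
```

The result matches the three groups listed above.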
How do we do that in Athena? Fortunately, there is a built-in function NTILE:
SELECT t.*,
       NTILE(3) OVER (ORDER BY col) AS group_number
FROM my_table t;
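If you want to try an NTILE query on the example numbers without an Athena table, the sketch below runs one locally with Python's built-in `sqlite3` module. SQLite is only a stand-in here, not Athena: it assumes an SQLite build with window-function support (3.25 or newer), and the table name `numbers` and the alias `group_number` are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database used as a local stand-in for an Athena table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (col INTEGER)")
conn.executemany(
    "INSERT INTO numbers (col) VALUES (?)",
    [(n,) for n in [3, 5, 1, 8, 2, 4, 7, 9, 0]],
)

# NTILE(3) orders the rows by col and splits them into three groups.
rows = conn.execute(
    """
    SELECT col,
           NTILE(3) OVER (ORDER BY col) AS group_number
    FROM numbers
    ORDER BY col
    """
).fetchall()
conn.close()

for col, group_number in rows:
    print(col, group_number)
```

Each printed line shows a number followed by the group it was assigned to, so the numbers 0, 1, 2 should land in group 1, the numbers 3, 4, 5 in group 2, and the numbers 7, 8, 9 in group 3.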