Making OFFSET LIMIT queries in AWS Athena

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (68/100)

There is no OFFSET support in AWS Athena, but we can use a workaround to get the same behavior.

Let’s assume that I want to execute this query:

1
2
3
SELECT * FROM some_table
ORDER BY column_A
OFFSET 20 LIMIT 10 -- this doesn't work

to get the desired outcome, I need three things:

  • a window function that assigns a number to every row
  • ordering method that sorts the table, so I get a deterministic outcome
  • filtering function that selects only the rows I want
1
2
3
SELECT * FROM (
    SELECT row_number() over(ORDER BY column_A) AS rn, * FROM some_table)
WHERE rn BETWEEN 20 AND 30; -- 30 = 20 (offset) + 10 (limit)

Subscribe to the newsletter and join the free email course.


Remember to share on social media!
If you like this text, please share it on Facebook/Twitter/LinkedIn/Reddit or other social media.

If you want to contact me, send me a message on LinkedIn or Twitter.

Would you like to have a call and talk? Please schedule a meeting using this link.


Bartosz Mikulski
Bartosz Mikulski * MLOps Engineer / data engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group

Subscribe to the newsletter and get access to my free email course on building trustworthy data pipelines.