How to download all available values from DynamoDB using pagination

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (41/100)

A common problem I have noticed in various applications is forgetting that DynamoDB paginates its results too. A single query or scan call returns at most 1 MB of data, but somehow, when developers see more than ten results, they assume that they have received everything ;)

How do we retrieve all values from DynamoDB when performing a query or a scan?

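We have to extract the LastEvaluatedKey from the response and pass it as the ExclusiveStartKey in the subsequent request. DynamoDB omits the LastEvaluatedKey from the last page, so we repeat the request until the key is gone. In this article, I show how to do it with a table scan when we use the AwsDynamoDBHook in Airflow: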

from boto3.dynamodb.conditions import Attr
from airflow.contrib.hooks.aws_dynamodb_hook import AwsDynamoDBHook

# Parameters shared by every scan request
query_params = {
    'FilterExpression': Attr('some_field').eq('value'),
    'ConsistentRead': True
}

hook = AwsDynamoDBHook('primary_key_name', 'table_name', 'aws_region')
connection = hook.get_conn()
table = connection.Table('table_name')

# The first request returns only the first page of results
response = table.scan(**query_params)

entries = list(response['Items'])

# LastEvaluatedKey is present only when there are more pages to fetch
while 'LastEvaluatedKey' in response:
    response = table.scan(**query_params, ExclusiveStartKey=response['LastEvaluatedKey'])
    entries.extend(response['Items'])

output = iter(entries)
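
If we need the same loop in more than one place, we can wrap it in a generator. What follows is only a minimal sketch, not a part of Airflow or boto3: scan_all_items is a name I made up, and FakeTable is a hypothetical in-memory stand-in for the boto3 Table that lets us run the example without AWS access. I assume the table object exposes scan(**params) and returns a dictionary with the Items and (optionally) LastEvaluatedKey keys, like the boto3 Table does.

```python
from typing import Any, Dict, Iterator


def scan_all_items(table: Any, **scan_params: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item from a paginated scan, fetching one page at a time."""
    params = dict(scan_params)
    while True:
        response = table.scan(**params)
        yield from response.get('Items', [])
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            # No LastEvaluatedKey means this was the last page
            break
        params['ExclusiveStartKey'] = last_key


class FakeTable:
    """Hypothetical stand-in for a boto3 Table; it returns two items per page."""

    def __init__(self, items, page_size=2):
        self.items = items
        self.page_size = page_size

    def scan(self, ExclusiveStartKey=None, **_ignored):
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey['index']
        end = start + self.page_size
        response = {'Items': self.items[start:end]}
        if end < len(self.items):
            response['LastEvaluatedKey'] = {'index': end}
        return response


fake_table = FakeTable([{'id': i} for i in range(5)])
all_items = list(scan_all_items(fake_table, ConsistentRead=True))
print(len(all_items))  # prints 5
```

With a real table, the call would look like scan_all_items(table, **query_params). Because the function is a generator, it requests the next page only when the caller asks for more items, so we don't have to keep the whole result in memory.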

Bartosz Mikulski
Bartosz Mikulski * data/machine learning engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group
