How to download all available values from DynamoDB using pagination
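A common problem I have noticed in various applications is forgetting that DynamoDB paginates its results too. Somehow, when developers see more than ten results, they assume that they have received everything ;) In fact, a single query or scan call returns at most 1 MB of data, and the rest must be requested page by page.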
How do we retrieve all values from DynamoDB when performing a query?
We have to extract the LastEvaluatedKey from the response and use it as the ExclusiveStartKey in the subsequent request. The same mechanism applies to both query and scan. In this article, I show how to do it when we use the AwsDynamoDBHook in Airflow:
from boto3.dynamodb.conditions import Attr
from airflow.contrib.hooks.aws_dynamodb_hook import AwsDynamoDBHook

query_params = {
    'FilterExpression': Attr('some_field').eq('value'),
    'ConsistentRead': True
}

hook = AwsDynamoDBHook('primary_key_name', 'table_name', 'aws_region')
connection = hook.get_conn()  # boto3 DynamoDB resource
table = connection.Table('table_name')

# The first request returns only the first page of results
response = table.scan(**query_params)
entries = list(response['Items'])

# LastEvaluatedKey is present only when more pages are available
while 'LastEvaluatedKey' in response:
    response = table.scan(**query_params, ExclusiveStartKey=response['LastEvaluatedKey'])
    entries.extend(response['Items'])

output = iter(entries)
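If the same loop is needed in more than one place, it can be wrapped in a generator. The following is only a minimal sketch, not part of Airflow or boto3: the scan_all helper and the FakeTable class are hypothetical names I introduce for illustration. FakeTable imitates the paging behavior of a table (two items per page), so the loop can be tried without an AWS connection.

```python
from typing import Any, Dict, Iterator


def scan_all(table: Any, **scan_params: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item from a scan, following LastEvaluatedKey (hypothetical helper)."""
    params = dict(scan_params)
    while True:
        response = table.scan(**params)
        yield from response.get('Items', [])
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            # No LastEvaluatedKey means this was the final page
            break
        # Resume the next scan right after the last evaluated key
        params['ExclusiveStartKey'] = last_key


class FakeTable:
    """Hypothetical stand-in for a table object that returns a few items per page."""

    def __init__(self, items, page_size=2):
        self._items = items
        self._page_size = page_size

    def scan(self, ExclusiveStartKey=None, **_ignored):
        # The fake key stores the list index of the last returned item
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey['index'] + 1
        page = self._items[start:start + self._page_size]
        response = {'Items': page}
        end = start + len(page)
        if end < len(self._items):
            response['LastEvaluatedKey'] = {'index': end - 1}
        return response


items = [{'id': i} for i in range(5)]
all_items = list(scan_all(FakeTable(items)))
```

With a real table, the call would presumably look like list(scan_all(table, **query_params)). Because the helper is a generator, items are produced page by page, so the caller can stop early without fetching the remaining pages.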