How to perform a batch write to DynamoDB using boto3

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (17/100)

This article shows how to store the rows of a Pandas DataFrame in DynamoDB using boto3's batch writer.

First, we have to create a DynamoDB resource and get a handle to the table:

import boto3

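# The empty strings are placeholders: pass real credentials here,
# or omit both arguments to let boto3 use its default credential lookup.
dynamodb = boto3.resource('dynamodb', aws_access_key_id='', aws_secret_access_key='')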
table = dynamodb.Table('table_name')

When the connection handler is ready, we must create a batch writer using the with statement:

with table.batch_writer() as batch:
    pass  # we will change that

Now, we can iterate over the rows of the Pandas DataFrame inside the with block:

with table.batch_writer() as batch:
    for index, row in df.iterrows():
        pass # to be changed


We will extract the fields we want to store in DynamoDB and put them in a dictionary in the loop:

with table.batch_writer() as batch:
    for index, row in df.iterrows():
        content = {
            'field_A': row['A'],
            'field_B': row['B']
        }
        # there is still something missing
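
One thing to watch when building the dictionary: boto3 rejects Python floats and expects Decimal values instead, and, as far as I know, it does not accept NumPy scalar types (which is what Pandas usually returns for a cell) either. Below is a minimal sketch of a conversion helper. The names to_dynamodb_value and row_to_item are hypothetical, not part of boto3 or Pandas, and the sketch does not handle missing values (NaN).

```python
from decimal import Decimal


def to_dynamodb_value(value):
    """Convert one cell value into a type boto3 should be able to serialize."""
    # NumPy scalars expose .item(), which returns the equivalent
    # built-in Python value (int, float, bool, ...).
    if hasattr(value, "item") and callable(value.item):
        value = value.item()
    # boto3 refuses Python floats. Going through str() keeps the
    # short decimal representation instead of the binary expansion.
    if isinstance(value, float):
        return Decimal(str(value))
    return value


def row_to_item(row, mapping):
    """Build an item from a row-like object.

    mapping: {DynamoDB attribute name: DataFrame column name}
    """
    return {attr: to_dynamodb_value(row[col]) for attr, col in mapping.items()}


# A plain dict stands in for a DataFrame row here.
item = row_to_item({"A": 1.5, "B": "text"}, {"field_A": "A", "field_B": "B"})
print(item)
```

Inside the loop, the dictionary literal could then be replaced with a call such as row_to_item(row, {'field_A': 'A', 'field_B': 'B'}).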

In the end, we use the put_item function to add the item to the batch:

with table.batch_writer() as batch:
    for index, row in df.iterrows():
        content = {
            'field_A': row['A'],
            'field_B': row['B']
        }
        batch.put_item(Item=content)

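The batch writer buffers the items and sends them to DynamoDB in batches as the buffer fills up. When our code exits the with block, the batch writer sends whatever is still left in the buffer.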
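
To make the buffering behavior easier to picture, here is a toy model of a batch writer. It is an illustration only, not boto3's implementation: the ToyBatchWriter class and its sent_batches attribute are made up for this example, and nothing is sent to AWS. The default of 25 items per batch is an assumption that mirrors the per-request limit of DynamoDB's BatchWriteItem operation.

```python
class ToyBatchWriter:
    """Illustrative stand-in for a batch writer (not boto3's code)."""

    def __init__(self, flush_amount=25):
        self.flush_amount = flush_amount
        self.buffer = []
        self.sent_batches = []  # records what would have been sent

    def __enter__(self):
        return self

    def put_item(self, Item):
        # Collect the item and send a batch once the buffer is full.
        self.buffer.append(Item)
        if len(self.buffer) >= self.flush_amount:
            self._flush()

    def _flush(self):
        # "Send" the buffered items, then start a new buffer.
        if self.buffer:
            self.sent_batches.append(list(self.buffer))
            self.buffer.clear()

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the with block sends the remaining items.
        self._flush()
        return False


writer = ToyBatchWriter()
with writer as batch:
    for i in range(60):
        batch.put_item(Item={"field_A": i})

batch_sizes = [len(b) for b in writer.sent_batches]
print(batch_sizes)  # [25, 25, 10]
```

With 60 items, two full batches go out inside the loop and the last 10 items are sent when the with block ends.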




Bartosz Mikulski * data/machine learning engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group