How to check when an Athena table was updated

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (64/100)

This article will show you how to get a notification when an Athena table is updated, for example, when someone writes a file to the S3 location that is used as the source of the Athena table.

To get the notification, I will implement an AWS Lambda function that gets triggered by a CloudWatch event. Here is the Serverless configuration that triggers the function:

events:
    - cloudwatchEvent:
        event:
            source:
                - "aws.glue"
            detail-type:
                - "Glue Data Catalog Database State Change"
            detail:
                typeOfChange:
                    - "updateTable"
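
To check the filter locally before deploying, the same pattern can be written as a Python dictionary and tested against a sample event. This is only a minimal sketch: the `matches` helper is a hypothetical, simplified matcher that handles exact values only (the real service supports more operators), and the sample event, including `example_db`, is made up for illustration.

```python
import json

# The same filter as the Serverless configuration, expressed as a Python dict.
EVENT_PATTERN = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Data Catalog Database State Change"],
    "detail": {"typeOfChange": ["updateTable"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher (an assumption, not the service's full algorithm).

    Every key in the pattern must exist in the event. A list means
    "any of these exact values"; a dict is matched recursively.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical event used only to exercise the matcher.
sample_event = {
    "source": "aws.glue",
    "detail-type": "Glue Data Catalog Database State Change",
    "detail": {"typeOfChange": "updateTable", "databaseName": "example_db"},
}

print(json.dumps(EVENT_PATTERN, indent=2))
print(matches(EVENT_PATTERN, sample_event))
```

Keeping the pattern in one place like this makes it easy to add a unit test that fails when somebody changes the filter by accident.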

In the Python function, I have to extract the event details to get the database and the table name:

def main(event, context):
    # The database and the table name are stored in the request parameters
    database_name = event['requestParameters']['databaseName']
    table_name = event['requestParameters']['tableInput']['name']
    # The record count is one of the table parameters
    number_of_records = event['requestParameters']['tableInput']['parameters']['recordCount']
    # The time when the table was updated
    update_time = event['eventTime']

    # Here you can store the last update time in DynamoDB,
    # send a notification to a Slack channel, or do whatever you want.
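
The function above fails with a `KeyError` when any of the nested keys is missing, for example, when the table has no `recordCount` parameter. Below is a minimal sketch of a more defensive extraction. It assumes the event has the same shape as in the function above. The helpers `dig`, `summarize_update`, and `format_message` are hypothetical names introduced only for this example, and the sample event values are made up.

```python
from typing import Any, Optional


def dig(data: dict, *keys: str) -> Optional[Any]:
    """Walk nested dictionaries; return None when any key is missing."""
    current: Any = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def summarize_update(event: dict) -> dict:
    """Collect the same fields as the handler, without raising KeyError."""
    return {
        "database_name": dig(event, "requestParameters", "databaseName"),
        "table_name": dig(event, "requestParameters", "tableInput", "name"),
        "number_of_records": dig(
            event, "requestParameters", "tableInput", "parameters", "recordCount"
        ),
        "update_time": dig(event, "eventTime"),
    }


def format_message(summary: dict) -> str:
    """Build a short text that could be stored or sent as a notification."""
    return (
        f"Table {summary['database_name']}.{summary['table_name']} "
        f"was updated at {summary['update_time']} "
        f"(records: {summary['number_of_records']})"
    )


# Hypothetical event that follows the structure assumed by the handler.
sample_event = {
    "eventTime": "2020-01-01T00:00:00Z",
    "requestParameters": {
        "databaseName": "example_db",
        "tableInput": {"name": "example_table", "parameters": {}},
    },
}

print(format_message(summarize_update(sample_event)))
```

A missing value shows up as `None` in the message instead of stopping the function, so the notification is still sent even when the event is incomplete.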





Bartosz Mikulski
Bartosz Mikulski * data/machine learning engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group

