How to check when an Athena table was updated

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (64/100)

This article shows how to get a notification when an Athena table is updated, for example, when a job writes new files to the S3 location that backs the table and updates its definition in the Glue Data Catalog.

To get the notification, I will implement an AWS Lambda function that gets triggered by a CloudWatch event coming from AWS Glue. Here is the Serverless Framework configuration that triggers the function:

events:
    - cloudwatchEvent:
        event:
            source:
                - "aws.glue"
            detail-type:
                - "Glue Data Catalog Database State Change"
            detail:
                typeOfChange:
                    - "updateTable"
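Before deploying, it can help to check the pattern against a sample event locally. The sketch below only approximates the matching rules: every key in the pattern must be present in the event, and a leaf value must equal one of the listed values. The matches helper and the sample event are hypothetical; they are not part of any AWS library, and a real event carries more fields.

```python
# Local approximation of event-pattern matching; not an AWS API.
PATTERN = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Data Catalog Database State Change"],
    "detail": {"typeOfChange": ["updateTable"]},
}


def matches(pattern, event):
    """Return True if every pattern key is present in the event and matches."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the nested event object
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: the event value must equal one of the listed values
            return False
    return True


# Hypothetical sample event, reduced to the fields the pattern inspects
sample_event = {
    "source": "aws.glue",
    "detail-type": "Glue Data Catalog Database State Change",
    "detail": {"typeOfChange": "updateTable", "databaseName": "example_db"},
}

print(matches(PATTERN, sample_event))  # prints True
```

The comparison is case-sensitive, so an event with "UpdateTable" would not match a pattern that lists "updateTable". It is worth comparing the pattern with a real event captured in your account.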

In the Python function, I have to extract the event details to get the database and the table name:

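def main(event, context):
    # CloudWatch Events wraps the payload in 'detail'; fall back to the top level otherwise
    detail = event.get('detail', event)
    request_parameters = detail.get('requestParameters', {})
    table_input = request_parameters.get('tableInput', {})

    database_name = request_parameters.get('databaseName')
    table_name = table_input.get('name')
    # recordCount is an optional table parameter, so it may be missing (None)
    number_of_records = table_input.get('parameters', {}).get('recordCount')
    update_time = detail.get('eventTime', event.get('time'))

    # here you can store the last update time in DynamoDB, send a notification to a Slack channel, or do whatever you want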
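As one possible way to fill in the notification step, here is a minimal sketch that formats a message and posts it to a Slack incoming webhook using only the standard library. The function names, the SLACK_WEBHOOK_URL environment variable, and the example values are assumptions made for illustration; they do not come from the original setup.

```python
import json
import os
import urllib.request


def build_slack_message(database_name, table_name, update_time, number_of_records=None):
    """Build the JSON payload for a Slack incoming webhook."""
    text = f"Table {database_name}.{table_name} was updated at {update_time}"
    if number_of_records is not None:
        text += f" ({number_of_records} records)"
    return {"text": text}


def send_to_slack(message, webhook_url=None):
    """POST the payload to the webhook; SLACK_WEBHOOK_URL is a hypothetical variable name."""
    url = webhook_url or os.environ["SLACK_WEBHOOK_URL"]
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Made-up values, only to show the resulting message text
message = build_slack_message("sales", "orders", "2020-11-02T10:15:00Z", "1200")
print(message["text"])
```

Inside the handler, you would call build_slack_message with the values extracted from the event and pass the result to send_to_slack. Keeping the formatting separate from the HTTP call makes the message easy to test without network access.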

Bartosz Mikulski
Bartosz Mikulski * MLOps Engineer / data engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group
