How to check when an Athena table was updated

This article shows how to get a notification when an Athena table is updated, for example, when someone writes a file to the S3 location that serves as the source of the Athena table.

To get the notification, I will implement an AWS Lambda function that gets triggered by a CloudWatch event. Here is the Serverless configuration that triggers the function:

events:
    - cloudwatchEvent:
        event:
            source:
                - "aws.glue"
            detail-type:
                - "Glue Data Catalog Database State Change"
            detail:
                typeOfChange:
                    - "updateTable"

In the Python function, I have to extract the event details to get the database and the table name:

def main(event, context):
    # Names of the database and the table that were updated
    database_name = event['requestParameters']['databaseName']
    table_name = event['requestParameters']['tableInput']['name']
    # Record count stored in the table parameters
    number_of_records = event['requestParameters']['tableInput']['parameters']['recordCount']
    # Time of the update
    update_time = event['eventTime']

    # here you can store the last update time in DynamoDB, send a notification
    # to a Slack channel, or do whatever you want
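
The extraction above raises a KeyError as soon as one field is missing, so it may help to read the fields defensively before building the notification. Below is a minimal sketch, not the definitive handler. It assumes the event carries the same keys as in the function above, and additionally assumes that the payload may be wrapped under a "detail" key, as CloudWatch events commonly are. The helper names extract_table_update and format_notification are hypothetical.

```python
def extract_table_update(event):
    """Collect the table-update fields; missing values become None."""
    # Assumption: the fields sit at the top level (as in the function above)
    # or are wrapped under "detail", as CloudWatch events commonly are.
    if "requestParameters" in event:
        payload = event
    else:
        payload = event.get("detail") or {}

    params = payload.get("requestParameters") or {}
    table_input = params.get("tableInput") or {}
    table_parameters = table_input.get("parameters") or {}

    return {
        "database_name": params.get("databaseName"),
        "table_name": table_input.get("name"),
        "number_of_records": table_parameters.get("recordCount"),
        "update_time": payload.get("eventTime"),
    }


def format_notification(update):
    """Build a one-line message, e.g. for a Slack channel."""
    return (
        f"Table {update['database_name']}.{update['table_name']} "
        f"was updated at {update['update_time']} "
        f"(records: {update['number_of_records']})"
    )


def main(event, context):
    update = extract_table_update(event)
    message = format_notification(update)
    # Here the message could be sent to Slack or the update time stored in DynamoDB.
    return message
```

Because main returns the message instead of sending it, the function can be tried locally with a hand-written event dictionary before it is wired to a real notification channel.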