How to automatically remove files from S3 using lifecycle rules defined in Terraform

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (39/100)

When we want to remove old files from S3 automatically, we use lifecycle rules. I don’t recommend setting them in the AWS web console because, in my opinion, the whole infrastructure should be defined as code.

Thus, it is best to add a Terraform configuration for the bucket we want to clean.

We will need three things:

  • the name of the bucket
  • the key prefix of files we want to remove
  • the number of days after which we want to clean the data
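
One way to keep these three values out of the resource definition is to declare them as Terraform input variables. This is only a minimal sketch: the variable names below are my own placeholders, and the default of 180 days matches the example used later in this article.

```hcl
# Hypothetical variable names, chosen for this sketch only.

variable "bucket_name" {
  type        = string
  description = "Name of the bucket to clean"
}

variable "key_prefix" {
  type        = string
  description = "Key prefix of the files to remove, for example \"key_prefix/\""
}

variable "expiration_days" {
  type        = number
  description = "Number of days after which the files are removed"
  default     = 180
}
```

Inside a resource, such values are referenced as var.bucket_name, var.key_prefix, and var.expiration_days.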

When we have all of that, we can define the lifecycle rule in Terraform:

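resource "aws_s3_bucket" "bucket" {
    bucket = "bucket_name"  # placeholder: the name of the bucket to clean
    acl    = "private"

    # Note: AWS provider v4 and later deprecate the inline lifecycle_rule
    # and acl arguments in favor of separate resources.
    lifecycle_rule {
        id      = "remove_old_files"
        enabled = true

        # The rule applies only to objects whose keys start with this prefix
        prefix = "key_prefix/"

        # Expire (delete) matching objects 180 days after their creation
        expiration {
            days = 180
        }
    }
}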

In this example, I configured an expiration rule that removes files older than 180 days from the bucket bucket_name, but it applies only to the files whose keys start with the prefix key_prefix/.
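
In version 4 and later of the AWS provider, the inline lifecycle_rule block is deprecated, and the same rule is usually written as a separate aws_s3_bucket_lifecycle_configuration resource. Below is a minimal sketch of that variant, assuming the same bucket name, prefix, and retention period as above. The resource label bucket_lifecycle is my placeholder, and I have not tested this against every provider version.

```hcl
resource "aws_s3_bucket" "bucket" {
  bucket = "bucket_name" # placeholder, as in the example above
}

# Assumes AWS provider v4 or later; "bucket_lifecycle" is a hypothetical label.
resource "aws_s3_bucket_lifecycle_configuration" "bucket_lifecycle" {
  bucket = aws_s3_bucket.bucket.id

  rule {
    id     = "remove_old_files"
    status = "Enabled"

    # Limit the rule to objects whose keys start with this prefix
    filter {
      prefix = "key_prefix/"
    }

    # Expire (delete) matching objects 180 days after their creation
    expiration {
      days = 180
    }
  }
}
```

The behavior should be the same as in the inline version: only the objects under key_prefix/ expire after 180 days.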






Bartosz Mikulski * data/machine learning engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group