How to make sure that you did not leave an EMR cluster running

How many times do you wonder whether you switched off the light before leaving the house? What about terminating the EMR cluster? Is it still running? Are you paying for something you are not using?

This article shows how to create a script that periodically checks whether an EMR cluster is running and displays a notification on macOS. I use osascript, which is available only on macOS, so the script does not work on Windows or Linux!

The first thing we have to do is define a naming convention. In my team, we assume that a personal dev cluster must have the owner's username in its name. On macOS, we can get the username using the whoami command. We will use that later.

Before proceeding, you should also install and configure AWS CLI.
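
A quick way to confirm that prerequisite is a small check like the one below. This is only a sketch: require_command is a hypothetical helper name I made up, and the commented-out line assumes you want to verify credentials with aws sts get-caller-identity, which fails when no valid credentials are configured.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the given command is available on PATH.
require_command() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "Missing required command: $1" >&2
        return 1
    fi
}

# Assumed usage: check that the CLI is installed, then ask AWS who we are.
# require_command aws && aws sts get-caller-identity
```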

Script to check whether a cluster is running

This section shows a few lines of code and explains how they work. At the end, I will post the whole script.

First, we have to retrieve the list of all running clusters that have the username of the current user in their names:

running_clusters=$(aws emr list-clusters --active | tail -n +2 | grep "$(whoami)")
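
Grepping the raw CLI output works, but it depends on the configured output format. As an alternative sketch, you could ask the CLI for the cluster names only and filter those. I am assuming here that the Clusters[].Name query and the text output format behave as in current AWS CLI versions; filter_by_owner is a hypothetical helper, not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: read cluster names (one per line) from stdin and keep
# only those containing the given owner name. -F treats it as a fixed string.
filter_by_owner() {
    grep -F -- "$1" || true   # grep exits 1 on "no match"; treat that as empty output
}

# Assumed usage (needs a configured AWS CLI). Text output separates the names
# with tabs, so tr puts each name on its own line:
#   aws emr list-clusters --active --query 'Clusters[].Name' --output text \
#       | tr '\t' '\n' | filter_by_owner "$(whoami)"
```

With this variant, every output line is exactly one cluster name, which makes counting straightforward.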

After that, we must count the number of lines in the output:

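# Quote the variable; an unquoted echo would collapse all matches into one line
number_of_lines=$(printf '%s\n' "$running_clusters" | sed '/^[[:space:]]*$/d' | wc -l)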
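
To see the counting step in isolation, here is a minimal sketch that runs without AWS. The count_lines helper and the sample data are made up for illustration; it uses grep -c instead of sed and wc, which is a swapped-in technique and not what the script in this article does.

```shell
#!/bin/bash
# Hypothetical helper: count the non-blank lines of the text passed as $1.
count_lines() {
    # printf with a quoted argument preserves embedded newlines;
    # grep -c counts the lines that contain at least one non-space character.
    printf '%s\n' "$1" | grep -c '[^[:space:]]' || true
}

sample=$(printf 'j-111 alice-dev\nj-222 alice-test')   # made-up sample data
count_lines "$sample"   # prints 2
count_lines ""          # prints 0
```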

If there is at least one line, we have a running EMR cluster. In that case, we run an osascript that displays a notification:

if (( number_of_lines > 0 )); then
    message="You have an EMR cluster running"
    # Pass the AppleScript to osascript directly; no eval is needed
    osascript -e "display notification \"$message\" with title \"EMR cluster\""
fi
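
If the message ever contains a double quote or a backslash, the AppleScript string would break. The sketch below shows one way to build the notification source with escaping. Both function names are hypothetical, and I am assuming that escaping backslashes and double quotes is enough for a plain AppleScript string literal.

```shell
#!/bin/bash
# Hypothetical helper: escape backslashes and double quotes so that a value
# can sit inside an AppleScript string literal.
escape_applescript() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build the AppleScript source for a notification.
# $1 is the message, $2 is the title.
notification_script() {
    printf 'display notification "%s" with title "%s"' \
        "$(escape_applescript "$1")" "$(escape_applescript "$2")"
}

# On macOS, the result can be passed to osascript directly:
#   osascript -e "$(notification_script "You have an EMR cluster running" "EMR cluster")"
```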

Here is the complete script:

#!/bin/bash

# Lines of the AWS CLI output that mention the current user's name
running_clusters=$(aws emr list-clusters --active | tail -n +2 | grep "$(whoami)")
# Count the non-blank lines; quoting keeps one line per match
number_of_lines=$(printf '%s\n' "$running_clusters" | sed '/^[[:space:]]*$/d' | wc -l | tr -d ' ')
if (( number_of_lines > 0 )); then
    message="You have $number_of_lines cluster(s) running"
    osascript -e "display notification \"$message\" with title \"Dev clusters\""
fi

Now, you have to make this file executable and try to run it. If you have a running EMR cluster whose name matches the naming convention, you should see a notification.

Use cron to run the script

In the final step, we have to add the script to the crontab and run it periodically. I like to run it on the 25th and 55th minute of every hour:

crontab -e

This command opens the current user's crontab, so sudo is not needed. In the crontab, add the following line:

25,55 * * * * ~/script_path/script_name.sh
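
If you prefer not to edit the crontab by hand, the entry can also be added from the shell. The sketch below is one possible approach: with_cron_entry is a hypothetical helper, and the path is the same placeholder as above. It appends the entry only if an identical line is not already present.

```shell
#!/bin/bash
# Hypothetical helper: read an existing crontab on stdin and print it with the
# given entry appended, unless an identical line is already there.
with_cron_entry() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -x matches whole lines only, -F treats the entry as a fixed string
    if ! printf '%s\n' "$existing" | grep -qxF -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Assumed usage; "crontab -" installs the crontab read from stdin:
#   crontab -l 2>/dev/null \
#       | with_cron_entry '25,55 * * * * ~/script_path/script_name.sh' \
#       | crontab -
```

Keep in mind that cron usually runs jobs with a minimal PATH, so if the notification never shows up, it may help to call aws by its full path inside the script.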