How to find and terminate an idle Redshift session

Redshift has a limit of 500 connections. If we use Redshift as a data source of a user-facing application, we can easily exceed that limit. In this case, we end up with a terrible user experience and hard to debug problems (I bet that exceeded connection limit is not going to be the first possible cause we will look at while debugging).

In this article, I show how to find idle Redshift sessions and how to terminate them. First, I will show you the full query that returns only the idle sessions. In the second part of the article, I will explain it step-by-step.

How to get idle user sessions

SELECT
    sessions.process AS process_id,
    trim(sessions.user_name) AS user_name,
    sessions.starttime AS session_start_time,
    datediff(s, sessions.starttime, sysdate) AS session_duration_is_seconds,
    queries.last_end AS last_query_end_time,
    datediff(s, CASE WHEN queries.last_end IS NOT NULL
        THEN queries.last_end
        ELSE sessions.starttime END
    , sysdate) AS idle_duration_in_seconds
FROM
        (SELECT starttime, process, u.usesysid, user_name
        FROM stv_sessions s, pg_user u
        WHERE s.user_name = u.usename
        AND u.usesysid>1
        AND process NOT IN (
            SELECT pid FROM stv_inflight WHERE userid>1
            UNION SELECT pid FROM stv_recents WHERE status != 'Done' AND userid>1
        )) sessions
        LEFT OUTER JOIN (
            SELECT userid, pid, max(endtime) AS last_end FROM svl_statementtext
            WHERE userid>1 AND sequence=0
            GROUP BY 1,2) queries
            ON sessions.usesysid = queries.userid AND sessions.process = queries.pid
        WHERE (queries.last_end > sessions.starttime OR queries.last_end IS NULL)
        ORDER BY idle_duration_in_seconds DESC;

How to terminate a session

To terminate an active user session, we need the process_id that identifies the session. When we have the process id, we can use the pg_terminate_backend( process_id ) function to disconnect the user from Redshift.

How does it work?

Find active sessions

We start by retrieving all active user sessions. For that, we can use the stv_sessions Redshift table:

SELECT starttime, process, u.usesysid, user_name
FROM stv_sessions s, pg_user u
WHERE s.user_name = u.usename
AND u.usesysid > 1

Find active queries

We are not interested in sessions of users who are currently executing a query, so we filter out in-flight queries and recently finished queries. Hence, the process NOT IN statement that removes the sessions found by this query:

SELECT pid FROM stv_inflight WHERE userid>1
UNION
SELECT pid FROM stv_recents WHERE status != 'Done' AND userid>1

Which session is idle?

To figure out which sessions are idle, we need the end time of the last statement executed using those sessions. We can get this information from the slv_statementtext table:

SELECT userid, pid, max(endtime) AS last_end
FROM svl_statementtext
WHERE userid>1 AND sequence=0
GROUP BY 1,2
Older post

How to configure Spark to maximize resource usage while using AWS EMR

How to configure EMR to use all available resources when running a Spark cluster

Newer post

How to retrieve the statuses of the recent DAG executions from Airflow database

How to make a dashboard that displays Airflow DAG statuses