How to decode base64 to text in AWS Athena

This article is a part of my "100 data engineering tutorials in 100 days" challenge. (75/100)

Decoding base64 in AWS Athena takes two steps. First, we use the from_base64 function to get the binary representation of the decoded content. We don't get text automatically because base64 can encode any binary data as a string, for example, a picture.

Therefore, Athena cannot know that the decoded content is a string, so from_base64 returns a varbinary value. To get text, we pass that value to the from_utf8 function, which interprets the bytes as UTF-8 and returns a varchar.
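To see the two steps separately, here is a minimal sketch that asks Athena for the type of the value after each function call. The literal 'SGVsbG8sIEF0aGVuYSE=' is my own illustrative input (the base64 form of "Hello, Athena!"), not something from a real table.

```sql
-- Sketch: inspect the data type produced by each decoding step.
-- The base64 literal is a made-up example that decodes to "Hello, Athena!".
SELECT
    typeof(from_base64('SGVsbG8sIEF0aGVuYSE='))            AS after_from_base64, -- varbinary
    typeof(from_utf8(from_base64('SGVsbG8sIEF0aGVuYSE='))) AS after_from_utf8    -- varchar
```

The first column shows why the second step is needed: the decoded value is still raw bytes until from_utf8 turns it into text.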

The complete SQL looks like this:

SELECT from_utf8(from_base64(base64_encoded_column)) FROM some_table
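If you want to try the query without a real table, the sketch below feeds it inline values instead. The sample rows are hypothetical. Wrapping from_base64 in try() is my addition, based on the assumption that malformed base64 raises an invalid-argument error that try() turns into NULL; without it, a single bad row would likely fail the whole query.

```sql
-- Sketch: run the decoding against inline sample rows instead of some_table.
-- The rows below are illustrative assumptions, not data from the article.
SELECT
    base64_encoded_column,
    -- try() is expected to return NULL for input that is not valid base64
    from_utf8(try(from_base64(base64_encoded_column))) AS decoded_text
FROM (
    VALUES
        ('SGVsbG8sIEF0aGVuYSE='),  -- base64 of "Hello, Athena!"
        ('not valid base64!')      -- deliberately malformed input
) AS sample (base64_encoded_column)
```

Note that from_utf8 does not fail on bytes that are not valid UTF-8. It replaces invalid sequences with the Unicode replacement character, so a decoded picture would come back as mostly unreadable text rather than an error.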

One more thing. Never, ever say that base64 is encryption. It is not. It is just a method of representing binary data as a string, so you can send it to/from a REST API inside a JSON object or include it in a URL.
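A quick way to convince yourself is a round trip. The sketch below encodes a string and decodes it again using only built-in functions, with no key or password involved. The value 'my secret' is a made-up example. The last column uses to_base64url, which, as far as I know, produces the URL-safe variant of the alphabet and has a matching from_base64url function for decoding.

```sql
-- Sketch: base64 is reversible by anyone, so it offers no secrecy.
-- 'my secret' is an illustrative value only.
SELECT
    to_base64(to_utf8('my secret'))                          AS encoded,
    from_utf8(from_base64(to_base64(to_utf8('my secret'))))  AS decoded,  -- 'my secret'
    to_base64url(to_utf8('my secret'))                       AS url_safe_encoded
```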



Bartosz Mikulski
Bartosz Mikulski * MLOps Engineer / data engineer * conference speaker * co-founder of Software Craft Poznan & Poznan Scala User Group
