Leveraging AI to drive growth and innovation
How to use AWSAthenaOperator in Airflow to verify that a DAG finished successfully
How to check that an AWS Athena table contains data after running an Airflow DAG
12 Oct 2020
How to start an AWS Glue Crawler to refresh Athena tables using boto3
How to create and start an AWS Glue Crawler from Python code using boto3
11 Oct 2020
How to retrieve the table descriptions from Glue Data Catalog using boto3
How to get the comments from the create table statements when the metadata is stored in the Glue Data Catalog
10 Oct 2020
How to perform a batch write to DynamoDB using boto3
How to write multiple DynamoDB items at once using boto3
09 Oct 2020
How to populate a PostgreSQL (RDS) database with data from CSV files stored in AWS S3
How to upload S3 data into RDS tables
08 Oct 2020
How to concatenate multiple MySQL rows into a single field?
How to concatenate multiple rows into a string in MySQL
07 Oct 2020
How to get an array/bag of elements from the Hive group by operator?
How to get an array of elements from one column when grouping by another column in Hive
06 Oct 2020
Working with dates and time in Apache Spark
How to get relative dates (yesterday, tomorrow) in Apache Spark, and how to calculate the difference between two dates
05 Oct 2020
How to save an Apache Spark DataFrame as a dynamically partitioned table in Hive
How to use the saveAsTable function to create a partitioned table
04 Oct 2020
When to cache an Apache Spark DataFrame?
Should we cache everything in Apache Spark or are there any rules?
03 Oct 2020
How to flatten a struct in a Spark DataFrame?
How to convert struct fields into separate columns
02 Oct 2020
What is the difference between CUBE and ROLLUP and how to use it in Apache Spark?
How to use the cube and rollup functions in Apache Spark or PySpark, and what the difference between a cube and a rollup is
01 Oct 2020