By default, Hive does not support passing the columns to GROUP BY or ORDER BY using their positions instead of names. Therefore, when I try to run this query:

SELECT column_A, count(*) FROM table_name GROUP BY 1

I will see an error message telling me that the column_A is not in the GROUP BY expression.

To get the behavior I want, I have to enable the position alias usage before running my query. I can do that by setting the hive.groupby.orderby.position.alias property to true:

SET hive.groupby.orderby.position.alias=true;

SELECT column_A, count(*) FROM table_name GROUP BY 1

Now, Hive is going to correctly recoginize that I want to group by the first column, lookup its name and use that to execute the query.

Older post

How to check whether a regular expression matches a string in Hive

What is the equivalent of Athena/Presto regexp_like in Hive

Newer post

How to define an AWS Athena view using Airflow

How to use the AWSAthenaOperator