Doing data quality checks using the SQLCheckOperator
SQLCheckOperator is an Airflow operator that executes a SQL query, expects to receive a single row in the response, and attempts to cast every value in that row to a boolean. It succeeds when all returned values cast to True, so the query may return any of these values:
- a boolean True
- a non-zero numeric value (including negative values!)
- a non-empty string
- a non-empty list, set, or dictionary
In addition to failing when any of the values casts to False, the SQLCheckOperator also fails when the query returns no rows.
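The pass/fail rule described above can be approximated in plain Python. This is only a minimal sketch of the logic, not the operator's actual source code; the function name `check_passes` is hypothetical and exists purely for illustration.

```python
from typing import Optional, Sequence


def check_passes(row: Optional[Sequence]) -> bool:
    """Approximate the SQLCheckOperator rule for a single result row."""
    # No row returned by the query: the check fails.
    if not row:
        return False
    # Every value in the row must be truthy once cast to a boolean.
    return all(bool(value) for value in row)


print(check_passes((1, "ok", True)))  # True
print(check_passes((-5,)))            # True: negative numbers are truthy
print(check_passes((0,)))             # False: zero is falsy
print(check_passes((3, "")))          # False: an empty string is falsy
print(check_passes(None))             # False: the query returned no row
```

Note that a single falsy value anywhere in the row is enough to fail the whole check.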
For example, we can use the operator to check that a table contains at least one row for yesterday's date:
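from airflow.operators.sql import SQLCheckOperator

operator = SQLCheckOperator(
    task_id="check_row_count",  # placeholder name; every Airflow task needs a task_id
    conn_id="my_database",  # placeholder connection ID; replace with your own
    sql="SELECT COUNT(*) FROM some_table WHERE some_column='{{ yesterday_ds_nodash }}'",
)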
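Because COUNT(*) always returns exactly one row, this check fails only when the count is zero. The sketch below imitates that behavior outside Airflow, using an in-memory SQLite database as a stand-in for the real connection. The sample date values and the helper name `count_check` are assumptions made for illustration.

```python
import sqlite3

# In-memory database standing in for the real data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (some_column TEXT)")
conn.execute("INSERT INTO some_table VALUES ('20240101')")


def count_check(day: str) -> bool:
    """Run the count query and apply the truthiness rule to the first row."""
    row = conn.execute(
        "SELECT COUNT(*) FROM some_table WHERE some_column = ?", (day,)
    ).fetchone()
    # The check passes only if a row came back and every value in it is truthy.
    return row is not None and all(bool(value) for value in row)


print(count_check("20240101"))  # True: one matching row, so the count is 1
print(count_check("20240102"))  # False: no matching rows, so the count is 0
```

In the Airflow version, the date comes from the templated `{{ yesterday_ds_nodash }}` value instead of a function argument.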
Bartosz Mikulski
- Data/MLOps engineer by day
- DevRel/copywriter by night
- Python and data engineering trainer
- Conference speaker
- Contributed a chapter to the book "97 Things Every Data Engineer Should Know"
- Twitter: @mikulskibartosz