Testing in Apache Airflow

Today we are going to talk about two ways of testing in Apache Airflow. Historically, testing in Airflow has been a headache for all users of the famous framework. The coupling of the code with the…
In this post we are going to talk about how DBT integrates with Spark and how this integration can be useful for us. DBT is a framework that facilitates data modeling throughout the different stages of the modeling cycle.…
Today I would like to deal with a topic that, from my point of view, is very important and is probably the holy grail of data engineering projects. However, we rarely reach the necessary level of maturity to be able…
Definitive guide to configuring the PySpark development environment in PyCharm, one of the most complete options. Spark has become the Big Data tool par excellence, helping us to process large volumes of data in a simplified, clustered and fault-tolerant way.…