Introduction: Apache Airflow is a free workflow orchestration tool. Workflows are created through Python scripts and can be monitored through its user interface. Examples of workflows where this tool could be used include the scheduling of ETL (Extract,…
Avoiding UDFs in Apache Spark, working at Damavis, and a guide to discovering which open-source tool is the right one for your project
A review of some Apache Spark library functions, with practical examples that avoid UDFs
Advanced Airflow, creation of machine learning pipelines and artificial intelligence in supermarkets
Cross-DAG task and sensor dependencies with Airflow. How to solve problems related to data engineering complexity.
How to configure Apache YARN to execute parallel jobs
How to deploy the Apache Airflow process orchestrator on Kubernetes
Schedule, orchestrate and monitor your Kettle tasks with Airflow using this Pentaho plugin.