Category: Data Engineering

Clean Code with Alpakka Kafka

At Damavis we are well aware of how important it is for our clients to have access to their data in real time. For this reason, one of our strengths is the development of tools and technologies that can move, transform and…

Introduction to Apache YARN

Note: the code in this post has been tested with Apache Hadoop 2.10.1. If you have not done so yet, please see our previous post, Introduction to Apache Hadoop, to configure this version of Hadoop. As explained in…

Introduction to Apache Hadoop

Introduction to Apache Hadoop and getting started

It can be a bit overwhelming to understand the role of the most common open source technologies used in big data contexts. For example, most of you have probably heard of tools such as Apache Hadoop, Apache Spark, Apache…

Pentaho PDI Plugin for Airflow

Schedule, orchestrate and monitor your Kettle tasks from Airflow with this Pentaho plugin. At Damavis we know the importance of data processing. Extracting, cleaning, transforming, aggregating, loading or cross-referencing multiple data sources allows our clients to have insights or predictive…