Introduction to software testing
Software tests are checks performed on a piece of code or an entire software system to validate that it behaves as expected. The main purpose of testing is to detect errors in the code in…
Apache Airflow is an open source workflow orchestration tool widely used in data engineering. For an introduction, see our other blog post, Basics on Apache Airflow. In this…
When reviewing code, we often spend time going over small bugs or stylistic details that distract us from what’s really important. In this post we will introduce pre-commit, a tool that attacks this problem by automatically correcting our code. What…
This article aims to save some time for those interested in Elasticsearch and to share some useful concepts and resources. What is Elasticsearch? Elasticsearch is a free, open source, distributed search engine developed in Java, capable of…
In the last few years, we have seen a great evolution in the Python programming language as it has gained popularity to become one of the most widely used programming languages. The involvement of the community with the development of…
What is MongoDB? MongoDB is an open source NoSQL database. This means that data does not necessarily have to follow a schema. All data is stored in a JSON-like document format known as BSON, or Binary JSON, and will…
Since the release of our Pentaho PDI plugin for Apache Airflow, we have seen an industry shift towards Apache Hop for data processing. What is Apache Hop? Apache Hop, which started in late 2019 as a fork of Kettle PDI, is…
Apache Spark is an open source framework that allows us to process large volumes of data in a distributed way. How? By dividing data volumes that are too large to process on one machine and distributing them among the different…
Apache Drools is a tool for managing business rules of any kind. Because it is a very extensive framework, this article is limited to applying a simple use case with Scala, leaving aside the theoretical explanations that…
What is Apache Kafka? Apache Kafka is an open source distributed event system. It was originally developed at LinkedIn to meet the needs created by the company's rapid growth and its move to a microservices-based infrastructure. It is also an…