## Big Data in soccer: Creating an xG model

Building an expected goals (xG) model from Wyscout data to measure goal probability

## Introduction to Logistic Regression

Logistic regression is a statistical method for modeling the relationship between a binary categorical variable and a set of explanatory variables. Specifically, it models the probability that an observation belongs to one of the categories of that binary…
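As a minimal sketch of the idea in this excerpt, the snippet below turns a weighted sum of explanatory variables into a probability with the logistic (sigmoid) function. The function names and the coefficient values are hypothetical, chosen only for illustration; they are not taken from the post and are not fitted to any data.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: maps any real number to a value between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(features, weights, intercept):
    """Return P(y = 1 | features) under a logistic regression model."""
    linear_score = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(linear_score)


# Hypothetical coefficients, for illustration only (not fitted to data).
example_weights = [0.8, -0.5]
example_intercept = 0.0

probability = predict_probability([1.0, 1.6], example_weights, example_intercept)
print(f"Estimated probability of the positive class: {probability:.2f}")
```

In practice the weights and intercept would be estimated from labeled data, typically by maximum likelihood, rather than set by hand.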

## Dynamic Task Mapping in Airflow 2.3.0

One of the most notable new features of Airflow 2.3.0 is Dynamic Task Mapping, which makes it possible to create tasks dynamically at runtime. Thanks to this we can change the number of such tasks in our…

## Linear Programming and Simplex Method

We are all familiar with the concept of Linear Programming, or Linear Optimization: the branch of mathematics dedicated to optimizing (maximizing or minimizing) a linear objective function subject to constraints in the form of equations and/or inequalities.…
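In symbols, a problem of this kind can be written as below. The notation is assumed here for illustration and is not taken from the post: $x$ is the vector of decision variables, $c$ holds the objective coefficients, and $A$ and $b$ define the constraints.

$$
\begin{aligned}
\max_{x \in \mathbb{R}^n} \quad & c^\top x \\
\text{subject to} \quad & A x \le b, \\
& x \ge 0.
\end{aligned}
$$

This is one common standard form. A minimization problem fits it by maximizing $-c^\top x$, and an equality constraint can be expressed as a pair of opposite inequalities.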

## Data Governance using Apache Atlas

Today I would like to address a topic that, in my view, is very important and is probably the holy grail of data engineering projects. However, we rarely reach the necessary level of maturity to be able…

## Training and team building at Damavis

Last April, the management and human resources team at Damavis got down to work to organize a new meet-up to bring together all the members of the company once again. For this occasion, we changed our desks, computers and our…

## Deep Reinforcement Learning: DQN

In our previous post about Reinforcement Learning, we introduced this area through one of its most popular techniques: Q-learning. We laid the groundwork by talking about Markov decision processes, policies and value functions, and we saw…

## Creating vector graphics with Python

When we work with data and want to plot it, we sometimes need to generate the graphs in a format that scales to any resolution without losing quality. If we also need some interactivity, being…

## Introduction to Jenkins: Building CI/CD Pipelines

DevOps is a set of practices that aim to streamline the software development lifecycle by coordinating the development (Dev) and operations (Ops) departments. Teams that have an integrated DevOps culture are able to continuously integrate and deliver software (CI/CD)…

## Using QGIS to find out which bank has the best coverage in Barcelona

Can we know which of the banks in Barcelona has the best coverage? In this post we will try to answer this question using geographic data from OpenStreetMap and QGIS. First of all, let’s define what we will understand by…