## Top libraries for Data Science and Machine Learning

Data science is an interdisciplinary area that deals with extracting knowledge and useful information from data. Given the large amount of data generated by companies, it is important for data scientists to have tools that facilitate the manipulation, analysis and…

## Linear Regression with Python

Regression is a statistical methodology that describes the relationships between a continuous explained variable and a set of explanatory variables. In other words, regression models are able to predict the value of a dependent variable y with respect to a…
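As a rough illustration of predicting a dependent variable y from an explanatory variable, here is a minimal sketch of ordinary least squares with a single predictor. The function name `fit_simple_ols` and the toy data are assumptions made for this example and do not come from the article itself.

```python
def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data generated from y = 1 + 2x (no noise)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_ols(xs, ys)
predicted = intercept + slope * 5.0  # prediction for a new x value
```

With several explanatory variables, the same idea is usually expressed in matrix form and solved with a numerical library rather than by hand.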

## Granger Causality: Time series causalities

In the world of data science and analysis, time series are a key element, as they represent a very natural source of information: values of a magnitude at different points in time. This is why understanding their properties and knowing…

## Natural Language Processing (NLP) with Python

## Big Data in soccer: Creating an xG model

Building an xG (expected goals) model with Wyscout data to measure goal probability.

## Introduction to Logistic Regression

Logistic regression is a statistical methodology that allows modeling the relationships between a binary categorical variable and a set of explanatory variables. Specifically, it models the probability that an observation belongs to one of the categories of that binary variable.…
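To make the idea of modeling a probability concrete, the sketch below shows how a logistic model turns a linear combination of explanatory variables into a value between 0 and 1. The names `sigmoid` and `predict_proba`, and the weights used, are hypothetical and chosen only for illustration; no fitting step is shown.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    """Probability that an observation belongs to the positive class."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients, not estimated from real data
weights = [2.0]
bias = -2.0

p_boundary = predict_proba([1.0], weights, bias)  # linear term is exactly 0
p_high = predict_proba([3.0], weights, bias)      # linear term is 4
```

An observation whose linear term is zero sits on the decision boundary, so its predicted probability is one half; larger linear terms push the probability toward 1.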

## Detection of human poses through Deep Learning

Human pose detection is an important task in the field of computer vision, which consists of identifying the pose of a human figure from an image. This pose is defined from a series of key points, usually joints, so…

## Optimal Pricing in a hotel demand model

Revenue management (RM) can be defined as a set of techniques focused on analyzing consumer behavior with the purpose of obtaining the highest possible profit. In general, understanding how customers’ willingness to buy a certain good responds to this good’s…

## The use of Window in Apache Spark

When processing data, we often find ourselves in a situation where we want to calculate variables over a certain subset of observations. For example, we might be interested in the average value per group or the maximum value for each group. The groupBy…