Category Data Science

Linear Regression with Python

Linear Regression with Python

Regression is a statistical methodology that describes the relationships between a continuous explained variable and a set of explanatory variables. In other words, regression models are able to predict the value of a dependent variable y with respect to a…

Introduction to Logistic Regression

Introduction to Logistic Regression

Logistic regression is a statistical methodology that allows modeling the relationships between a binary categorical variable and a set of explanatory variables. Specifically, it models the probability that an observation belongs to one of the categories of that binary variable.…

Optimal Pricing in a hotel demand model

Revenue management (RM) can be defined as a set of techniques focused on analyzing consumer behavior with the purpose of obtaining the highest possible profit. In general, understanding how customers’ willingness to buy a certain good responds to this good’s…

The use of Window in Apache Spark

The use of Window in Apache Spark

When processing data we often find ourselves in a situation where we want to calculatevariables over certain subset of observations. For example, we might be interested in theaverage value per group or the maximum value for each group. The groupBy…