Damavis Summary of week 24, 2021

The use of Window in Spark, what it is like to work at Damavis, and news about artificial intelligence in healthcare

When processing data, we often find ourselves in a situation where we want to calculate variables over a certain subset of observations. For example, we might be interested in the average value per group or the maximum value for each group. groupBy and…

Avoiding UDFs in Apache Spark, working at Damavis, and a guide to discovering which open-source tool is the right one for your project
In the world of Data Engineering, it is well known that the use of UDFs (User Defined Functions) in Apache Spark (especially with the Python API) can compromise our application's performance. For this reason, at Damavis we try to avoid…

Definitive guide to configuring the PySpark development environment in PyCharm, one of the most complete options. Spark has become the Big Data tool par excellence, helping us to process large volumes of data in a simplified, clustered and fault-tolerant way.…