How Data Lakehouses Solve Common Issues With Data Warehouses
February 4, 2021 by Ryan Boyd in Engineering Blog
Read Rise of the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data...

Ray & MLflow: Taking Distributed Machine Learning Applications to Production
February 3, 2021 by Amog Kamsetty and Archit Kulkarni in Engineering Blog
This is a guest blog from software engineers Amog Kamsetty and Archit Kulkarni of Anyscale and contributors to Ray.io. In this blog post...

Strategies for Modernizing Investment Data Platforms
January 29, 2021 by Ricardo Portilla in Engineering Blog
The appetite for investment was at a historic high in 2020 for both individual and institutional investors. One study showed that "retail traders...

Burning Through Electronic Health Records in Real Time With Smolder
January 28, 2021 by Ryan DeCosmo and Frank Austin Nothaft in Engineering Blog
Check out the solution accelerator to download the notebook referred to throughout this blog. In previous blogs, we looked at two separate workflows...

How to Manage Python Dependencies in PySpark
December 22, 2020 by Hyukjin Kwon in Engineering Blog
Controlling the environment of an application is often challenging in a distributed computing environment - it is difficult to ensure all nodes have...