Milan Berka is an ML architect at DataSentics a.s. After finishing his degree in mathematics and stochastics, he began pursuing a career as a data scientist. However, it soon became clear that without proper data infrastructure and data engineering, it is very difficult to make a lasting impact with any data science model, regardless of how good the model itself is. Therefore, almost four years ago, he moved over to the “more engineering side” and started building experience in cloud infrastructure, big data frameworks, DevOps practices and other engineering topics. Combining machine learning and engineering knowledge, his primary focus now is designing and building solutions that ease or even enable the productionalization of machine learning models (MLOps).
November 17, 2020 04:00 PM PT
Getting machine learning models to production is notoriously difficult: it involves multiple teams (data scientists, data and machine learning engineers, operations, ...), who often do not communicate well with each other; the model may be trained in one environment but then productionalized in a completely different one; and it is not just about the code, but also about the data (features) and the model itself… At DataSentics, as a machine learning and cloud engineering studio, we see this struggle firsthand, both on our internal projects and on our clients' projects.
To address the issue, we decided to build a dedicated MLOps platform, which provides the necessary tooling, automation and standards to speed up the model productionalization process and make it more robust. The central piece of the puzzle is MLflow, the leading open-source model lifecycle management tool, around which we develop additional functionality and integrations with other systems, in our case primarily the Azure ecosystem (e.g. Azure Databricks, Azure DevOps or Azure Container Instances). Our key design goal is to reduce the time spent by everyone involved in the process of model productionalization to just a few minutes.
In this talk, we will discuss:
- How we think about MLOps challenges at DataSentics in general, what our real-life model productionalization experiences have been, and how they shaped the building of our MLOps platform
- Demo of model deployment on our MLOps platform
- Lessons learned and next steps
Speaker: Milan Berka
October 15, 2019 05:00 PM PT
Moneta has repeatedly been recognized as the most innovative bank on the Czech market. This is due in large part to its strategy of shifting completely to the cloud and using data and advanced analytics to innovate the customer experience, with use cases ranging from real-time recommendations to fraud detection.
In this talk, we’ll share how we migrated to the cloud to create an agile environment for analytics and AI. Core to this approach, from rapid prototyping of machine learning use cases to moving models into production, was building a unified platform for data and analytics on Apache Spark, Databricks and AWS. Discussion topics include: