Rafi Kurlansik

Senior Solutions Architect, Databricks

Rafi is a Sr. Solutions Architect at Databricks where he specializes in enabling customers to scale their R workloads with Spark. He is also the primary author of the R User Guide to Databricks and the bricksteR package. In his spare time he enjoys gardening with native plants, cooking up a storm, and long video game sessions with his three children.

Past sessions

Summit 2021: Learn to Use Databricks for the Full ML Lifecycle

May 27, 2021 03:15 PM PT

Machine learning development brings many complexities beyond the traditional software development lifecycle. Unlike traditional software engineers, ML developers need to try multiple algorithms, tools, and parameters to get the best results, and they need to track this information so their work can be reproduced. In addition, they must use many distinct systems to productionize models. In this talk, learn how to operationalize ML across the full lifecycle with Databricks Machine Learning.

In this session watch:
Rafi Kurlansik, Senior Solutions Architect, Databricks


Historically, it has been challenging for R developers to build and share data products that use Apache Spark. In this talk, learn how you can publish Shiny apps that leverage the scale and speed of Databricks, Spark, and Delta Lake, so your stakeholders can make better use of insights from your data in their decision making. The session will walk through how to decouple a Shiny app from a Spark cluster without losing the ability to query billions of rows with Delta Lake. Learn how to safely promote models from development to production with the MLflow Model Registry on Databricks. By tracking model experimentation with MLflow and managing the lifecycle with the Registry, organizations can improve reproducibility and governance when publishing artifacts to RStudio Connect for batch or online scoring with Shiny or Plumber APIs.

Sample of topics discussed:

  • The best way to leverage Spark for a Shiny app, and how to make that Shiny app reliably available to your decision makers
  • Benchmarking performance of connecting to Spark from Shiny natively or via JDBC/ODBC
  • Programmatically managing models trained on Databricks with the MLflow Model Registry
  • Exploring different serving patterns for MLflow models with R
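The promotion workflow described in the abstract (register a model, validate it in a staging phase, then approve it for production) can be pictured as a small state machine over registry stages. The sketch below is a conceptual Python illustration only, not the MLflow API: the `ToyModelRegistry` class, its methods, and the rule that a model must pass through Staging before Production are all illustrative assumptions made to show the governance gate the Registry enables.

```python
# Conceptual sketch only: a toy registry mirroring the stage-promotion flow
# (None -> Staging -> Production -> Archived). Names here are hypothetical
# and do NOT match the real MLflow Model Registry API.

# Allowed stage transitions; requiring Staging before Production is an
# illustrative governance choice, not an MLflow rule.
ALLOWED = {
    "None": {"Staging", "Archived"},
    "Staging": {"Production", "Archived"},
    "Production": {"Archived"},
    "Archived": set(),
}

class ToyModelRegistry:
    def __init__(self):
        self.versions = {}   # (name, version) -> current stage
        self.history = []    # audit trail supporting reproducibility/governance

    def register(self, name, version):
        """Register a new model version at the initial 'None' stage."""
        self.versions[(name, version)] = "None"
        self.history.append((name, version, None, "None"))

    def transition(self, name, version, target):
        """Move a model version to a new stage, enforcing allowed transitions."""
        current = self.versions[(name, version)]
        if target not in ALLOWED[current]:
            raise ValueError(
                f"cannot move {name} v{version} from {current} to {target}"
            )
        self.versions[(name, version)] = target
        self.history.append((name, version, current, target))

registry = ToyModelRegistry()
registry.register("churn-model", 1)
registry.transition("churn-model", 1, "Staging")     # validated in development
registry.transition("churn-model", 1, "Production")  # approved for serving
```

In the real workflow, the same gate is applied by the MLflow Model Registry on Databricks, and the production-stage artifact is what gets published to RStudio Connect for scoring.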