Fugue Tune: Distributed Hybrid Hyperparameter Tuning
- Data Science, Machine Learning and MLOps
- Moscone South | Upper Mezzanine | 156
- 35 min
Hyperparameter tuning is used in model development to search for optimal hyperparameter values. Spark hyperparameter tuning has generally been applied to memory-bound problems, where one large dataset is split across machines and models are trained sequentially. In this talk, we'll explore how to use Apache Spark as an engine for parallelizing compute-bound tuning problems, where hundreds or thousands of smaller models are trained in parallel.
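To make the compute-bound pattern concrete, here is a minimal sketch (not Fugue-Tune's actual API) in which each hyperparameter configuration is an independent task training one small model. A local thread pool stands in for Spark executors; the function and parameter names (`train_small_model`, `lr`, `depth`) and the toy loss are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def train_small_model(config):
    """Stand-in for fitting one small model on one configuration.

    The toy quadratic loss below is purely illustrative; in practice
    this would fit a real model and return its validation loss.
    """
    loss = (config["lr"] - 0.1) ** 2 + (config["depth"] - 5) ** 2
    return {**config, "loss": loss}

# Hundreds or thousands of such configs are fully independent,
# which is what lets Spark map them across executors.
configs = [
    {"lr": lr, "depth": d}
    for lr in (0.01, 0.05, 0.1, 0.5)
    for d in (3, 5, 7)
]

# A local thread pool stands in for the cluster here.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(train_small_model, configs))

best = min(results, key=lambda r: r["loss"])
```

Because every task is self-contained, the same map-then-reduce-to-best shape distributes without any coordination between trials.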
There are multiple approaches to hyperparameter tuning. Grid search explores a finite set of value combinations, while Bayesian optimization builds on previous trials to propose better hyperparameter combinations. Approaches like grid search are trivially parallelizable, while Bayesian optimization has a sequential dependency. We can combine these two ideas by parallelizing a grid of Bayesian optimization trials over Spark. This is done through Fugue-Tune, a general interface that abstracts existing machine learning tuning frameworks such as Optuna and Hyperopt and provides a scalable layer on top of them.
In this talk, we'll explore how to tune a general ML objective on a hybrid search space, where model search, grid search, random search and Bayesian optimization are combined intuitively using Fugue-Tune's simple interface. Using Greykite as an example, we will demo tuning a forecasting model in a distributed fashion and monitoring the best result in real time.