Apache Spark has four main open source cluster managers: Mesos, YARN, Standalone, and Kubernetes. Each cluster manager has its own requirements and differences. Supporting the scheduling engine in IBM Spectrum Conductor originally required modifications to some core pieces of Spark. In this presentation, we will walk you through the changes we worked on with the community to make Spark more pluggable, which allowed us to support the Enterprise Grid Orchestrator scheduling engine without modifying core Spark files for every Spark version.
Kevin Doyle is the lead architect of IBM Spectrum Conductor at IBM, where he works with customers to deploy and manage all workloads, especially Spark and deep learning workloads, on on-premises clusters. Kevin has been working on distributed computing, grid, cloud, and big data for the past five years, with a focus on the management and lifecycle of workloads.