As the largest data and machine learning conference, Spark + AI Summit brings together over 7,500 engineers, scientists, developers, analysts and leaders from around the world to San Francisco every year. Over four days, we shape the future of big data, analytics and AI as we share knowledge, hear from thought leaders and train on open-source technologies like Apache Spark™, Delta Lake, MLflow, Koalas, TensorFlow and PyTorch.
Associate Provost of Data Science and Information, and Dean of the School of Information at UC Berkeley
Professor, UC Berkeley; Appointment in Electrical Engineering & Computer Science, School of Information
Co-founder & Chief Technologist, Databricks; Original Creator of Apache Spark™ & MLflow
Co-founder & Chief Architect, Databricks
Top Contributor & Original Creator of Apache Spark
Spark + AI Summit 2020 kicks off with pre-conference training workshops, including both instruction and hands-on classes. Apache Spark 2.x certification is also offered as an exam, with an optional half-day prep course. Training and certification are available as add-ons to the conference pass.
Half Day Courses
Full Day Courses
Group Pricing
Groups of 4 or more get 20% off. Please contact [email protected] for more information.
Half Day Prep course + Apache Spark™ Certification Exam (Tuesday)
Apache Spark™ Certification Exam (Wednesday)
Apache Spark™ Certification Exam (Thursday)
Conveniently located in the South of Market area, Moscone West offers easy access to downtown San Francisco’s many hotels and restaurants, so you can enjoy the city after the sessions close. Take advantage of easy transportation via BART, Muni and Caltrain.
Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. Spark began at UC Berkeley in 2009 and is now developed at the vendor-independent Apache Software Foundation. Since its initial release, Spark has seen rapid adoption by enterprises across wide-ranging industries. Internet powerhouses such as Facebook, Hotels.com, Cisco, Microsoft, and Netflix have deployed Spark at massive scale, processing multiple petabytes of data on clusters of more than 8,000 nodes. Apache Spark has also become the largest open-source community in big data, with more than 1,000 contributors from over 250 organizations.
Apache Spark™ Developers
Data and ML Engineers
Data Scientists
Infrastructure / Site Reliability Engineers
Researchers
Data Practitioners
Key Decision Makers
Business Executives
Excited to come to Spark + AI Summit, but need to convince your company to let you attend? We’ve prepared a letter for you. Download the template. (Word | PDF)
Data and AI need to be unified: the best AI applications require massive amounts of constantly updated training data to build state-of-the-art models. So far, Apache Spark™ is the only unified analytics engine that combines large-scale data processing with state-of-the-art machine learning and AI algorithms.
Combining Spark + AI topics, this conference is a unique “one-stop shop” for developers, data scientists, and tech executives seeking to apply the best tools in data and AI to build innovative products. Join more than 7,000 engineers, data scientists, AI experts, researchers, and business professionals for three days of in-depth learning and networking.
The sessions and training at this conference will cover data engineering and data science content, along with best practices for productionizing AI: keeping training data fresh with stream processing, quality monitoring, testing, and serving models at massive scale. The conference will also include deep-dive sessions on popular software frameworks, e.g., Delta Lake, MLflow, TensorFlow, scikit-learn, Keras, PyTorch, DeepLearning4J, BigDL, and deep learning pipelines.