On May 4th, we hosted a live webinar — Deep Learning and Apache Spark: Workflows and Best Practices. Rather than comparing deep learning systems or specific optimizations, this webinar focused on issues that are common to deep learning frameworks when running on an Apache Spark cluster, including:
- how to optimize cluster setup;
- how to configure the Spark cluster;
- how to ingest data; and
- how to monitor long-running jobs.
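To illustrate the cluster-configuration theme: a deep learning workload typically wants fewer, fatter tasks than an ETL job, so that each task owns an entire executor's resources. A minimal `spark-defaults.conf` sketch along those lines (the property names are standard Spark settings, but the values are placeholders you would tune to your own hardware):

```
# Give each executor most of a machine; deep learning workloads are memory-hungry.
spark.executor.memory   32g
spark.executor.cores    8

# Ask Spark to schedule one task per executor, so a single deep learning
# task owns all 8 cores (and the GPU, if any) on its machine.
spark.task.cpus         8
```

Setting `spark.task.cpus` equal to `spark.executor.cores` is one simple way to avoid several GPU-hungry tasks landing on the same machine.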
Recording and Slides
Toward the end, we held a Q&A. Below are all the questions asked; follow each link to view its answer on the forum.
- How can I become an expert in Apache Spark?
- What is the best way to integrate Spark into a web application? Will it make any difference if we use Spark concepts in a web application?
- Are there plans to integrate the Firmament cluster scheduler (http://firmament.io/) into Databricks in the future?
- Is there a Docker image of Spark with GPU support available to try?
- Which Spark version is required to work with deep learning frameworks?
- Is TensorFrames available for the open-source version of Spark, or do we need to use the Databricks product?
- PySpark seems like the right approach for deep learning on Spark at the moment, but for monitoring, Scala sounds like a better choice. Which language do you think will be used more for deep learning in the future?
- Where can I see a notebook example of TensorFrames?
- Are you offering any discount code for the Spark Summit to the attendees?
- Do you have any thoughts on combining Spark with TensorFlow (e.g., via frameworks) vs. integrated distributed TensorFlow (via ClusterSpec)?
- Are the limitations mentioned in the webinar peculiar to running Spark on Databricks, or do they also apply to Spark running outside of Databricks? If so, is Databricks working on ways to mitigate these limitations?
- Thanks for the great talk. How can we access the slides?
- How is the nvidia-docker image different between the Databricks stack and the PaperScale stack?
- Does it [Databricks] support H2O.ai-based deep learning, i.e., Sparkling Water and Deep Water?
- Can users specify the [image] filters used in the filtering process instead of using the default filters? If yes, how can we specify them?
If you’d like to perform deep learning on Databricks, start your 14-day free trial today.