Today, we're kicking off a new series: the Databricks Bi-Weekly Digest. Our goal is to summarize Spark-related content, compiled by Databricks, for the community. The digest will cover technical content pertaining to Apache Spark: blog posts, meetup tech talks, conference talks, and noteworthy news articles.
Here’s what’s happened in the last two weeks:
- A hands-on tutorial on how to use SparkR was presented at the useR 2016 conference: SparkR Tutorials at useR 2016. Peruse the presentations and try the notebooks in Databricks.
- The vote for Apache Spark 2.0 RC4 is underway; see the update from Reynold Xin.
- JIRAs closed:
- Want to be heard? Want to share your thoughts? Please take our Databricks 2016 Apache Spark Survey and make a difference.
- Tim Hunter presented Combining Machine Learning Frameworks with Apache Spark at the Hadoop Summit.
- Databricks released a Stanford CoreNLP wrapper for Apache Spark with an example notebook. Try it on Databricks.
- Joseph Bradley spoke at the NYC Spark Meetup: Distributed ML in Apache Spark Meetup Tech Talk. Learn about DataFrames in MLlib.
- A blog post, A Tale of Three Apache Spark APIs: RDDs, DataFrames, and Datasets, explains when to use which API and why.
- Three Apache Spark-related tech talks were presented at the Bay Area Apache Spark Meetup Tech-Talks @ SAP. You can watch the videos on YouTube.