Pandas API With Spark Back End (Koalas)

Demo Type

Product Tutorial




What you’ll learn

Despite being one of the most popular frameworks for data analysis, pandas runs on a single machine and cannot process terabytes of data. Databricks addresses this by letting users write code against the pandas API while Spark's distributed engine processes the data. This demo shows you how to process big data using the pandas API on Spark (previously known as Koalas).
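
As a rough illustration of the idea, here is a minimal sketch of a pandas-style aggregation run through the pandas API on Spark. It assumes PySpark 3.2 or later, where Koalas ships as pyspark.pandas, and a working Spark runtime such as a Databricks notebook. The column names and sample values are made up for the example, and the fallback import is only there so the snippet also runs where Spark is not installed.

```python
# Sketch only: the same pandas-style code, executed by Spark when available.
try:
    import pyspark.pandas as ps  # pandas API on Spark (formerly Koalas)
except ImportError:
    import pandas as ps  # fallback so the sketch runs without Spark

# Illustrative data (hypothetical columns); real workloads would read
# from files or tables instead of building a DataFrame in memory.
df = ps.DataFrame(
    {
        "category": ["a", "b", "a", "c", "b", "a"],
        "amount": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    }
)

# Familiar pandas calls: group, aggregate, sort.
totals = df.groupby("category")["amount"].sum().sort_index()

# to_dict() collects the (small) aggregated result to the driver.
result = {key: float(value) for key, value in totals.to_dict().items()}
print(result)
```

The only Spark-specific line is the import; the groupby, sum and sort_index calls are written exactly as they would be in plain pandas.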


To install the demo, get a free Databricks workspace and execute the following two commands in a Python notebook:

%pip install dbdemos
import dbdemos

Dbdemos is a Python library that installs complete Databricks demos in your workspace. Dbdemos will load and start notebooks, Delta Live Tables pipelines, clusters, Databricks SQL dashboards, warehouse models … See how to use dbdemos


Dbdemos is distributed as a GitHub project.

For more details, please view the GitHub file and follow the documentation.
Dbdemos is provided as is. See the License and Notice for more information.
Databricks does not offer official support for dbdemos and the associated assets.
For any issue, please open a ticket and the demo team will have a look on a best-effort basis. 



Build Your Chatbot With Dolly


Feature Store and Online Inference


MLOps — End-to-End Pipeline