Databricks COVID-19 Resource Hub

Mobilizing the data community to help solve one of the world's toughest problems

COVID-19 is arguably the world’s toughest problem today. That is true of the virus itself and of the many challenges involved in mitigating its impact, from drug discovery and testing to patient care, resource allocation and more. At Databricks, our hearts go out to those most affected by these challenges, and we’re inspired by the people working on the front lines to address them, sacrificing their own safety to ensure the safety of others.

We also feel a sense of urgency about how Databricks and the entire data community can help. These problems can’t be solved effectively without data, nor will they be solved by a single organization. We therefore see it as our responsibility to make the right data broadly available and actionable so that data teams around the globe can do their best work.

This page provides an up-to-date view of how we’re enabling the data community to help address the challenges associated with COVID-19, examples of how different organizations are using our platform and expertise, and some of the partnerships we’ve joined to combat this crisis. It also links to blogs and other resources that keep you informed about what we’re doing.

Community enablement

We’re collecting and regularly updating a broad range of COVID-19 datasets for anyone to analyze and explore on Databricks or Databricks Community Edition.

COVID-19 datasets on Databricks

Workshop Part 1: Intro to Python on Databricks

Workshop Part 2: Data analysis with pandas

Workshop Part 3: Machine Learning with scikit-learn

Workshop Part 4: Intro to Apache Spark

Databricks Community Edition

SPARK + AI SUMMIT - Hackathon for social good

Using Machine Learning to Optimize COVID-19 Predictions

The Complexities Around COVID-19 Data

Improving Public Health Surveillance During COVID-19 with Data Analytics and AI

South Korea COVID-19 Dashboard

Platform use cases

We’re working closely with leading healthcare providers, research institutions and pharmaceutical companies to address a variety of COVID-19-related challenges using the Databricks platform. For more detail on these use cases, or to share others, please contact us.

Spread prediction modeling and resource allocation

Partnering with healthcare systems to predict infection rates by analyzing hospital admission records and population health data. These insights help healthcare providers reallocate resources or relocate patients across facilities to meet fluctuating needs.

Patient tracking and management

Partnering with healthcare providers to analyze EHR and hospital data to track the flow of infected patients. This helps improve capacity management and determine which staff need to be monitored, tested and potentially quarantined based on their level of interaction with infected patients.

Accelerated clinical trials

Partnering with pharmaceutical organizations to leverage machine learning to assess the impact of COVID-19 on clinical trials.

Disaster modeling and response

Partnering with public agencies to predict when infection rates will peak and to assess the ability of local public services to respond to critical health needs.

Viral genetics

Partnering with public genomics consortia to provide informatics support for clustering and analysis of large collections of viral genomes to identify genomic features that drive disease severity and spread.

Real-time disease detection

Partnering with manufacturers of health wearables to analyze streaming IoT data (e.g. patient temperature) to identify at-risk and infected patients. This is useful for predicting outbreaks as well as implementing preventative measures that improve patient outcomes.

Partnerships

We’ve joined forces with other leading organizations to raise money and provide technical expertise to those who need it most.

COVID-19 Coronavirus regional response fund

We are joining other leading technology companies, including ServiceNow, Twilio and Salesforce, in an effort to raise $22M for relief efforts.

Additional resources

We’re regularly publishing blogs, tech talks, and other resources to help keep customers, partners and the data community informed.

Contact us