PyTorch on Databricks - Introducing the Spark PyTorch Distributor

Background and Motives: Deep learning algorithms are complex and time-consuming to train, but are quickly moving from the lab to production because...

Synthetic Data for Better Machine Learning

April 12, 2023 by Sean Owen
You've likely tried the buzziest advances in generative AI in the past year, tools like ChatGPT and DALL-E. They consume complex data...

Saving Mothers with ML: How CareSource uses MLOps to Improve Healthcare in High-Risk Obstetrics

This blog post is in collaboration with Russ Scoville (Vice President of Enterprise Data Services), Arpit Gupta (Director of Predictive Analytics and Data...

Fine-Tuning Large Language Models with Hugging Face and DeepSpeed

March 20, 2023 by Sean Owen
Large language models (LLMs) are currently in the spotlight following the sensational release of ChatGPT. Many are wondering how to take advantage of...

Unsupervised Outlier Detection on Databricks

Kakapo (KAH-kə-poh) implements a standard set of APIs for outlier detection at scale on Databricks. It provides an integration of the...

Accelerate your model development with the new MLflow Experiments UI

MLflow is the premier platform for model development and experimentation. Thousands of data scientists use MLflow Experiment Tracking every day to find the...

Getting started with NLP using Hugging Face transformers pipelines

February 6, 2023 by Paul Ogilvie and Maddie Dawson
Advances in Natural Language Processing (NLP) have unlocked unprecedented opportunities for businesses to get value out of their text data. Natural Language Processing...

What’s New With SQL User-Defined Functions

Since their initial release, SQL user-defined functions have become hugely popular among both Databricks Runtime and Databricks SQL customers. This simple yet...

Building Geospatial Data Products

January 6, 2023 by Milos Colic
Geospatial data has been driving innovation for centuries, through the use of maps, cartography and, more recently, digital content. For example, the oldest...

Accelerating SIEM Migrations With the SPL to PySpark Transpiler

December 16, 2022 by Serge Smertin and Jason Trost
In this blog post, we introduce transpiler, a Databricks Labs open-source project that automates the translation of Splunk Search Processing Language (SPL)...