Engineering blog

Synthetic Data for Better Machine Learning

April 12, 2023 by Sean Owen in Engineering Blog
You've likely tried the buzziest advances in generative AI in the past year, tools like ChatGPT and DALL-E. They consume complex data...

Saving Mothers with ML: How CareSource uses MLOps to Improve Healthcare in High-Risk Obstetrics

This blog post is in collaboration with Russ Scoville (Vice President of Enterprise Data Services), Arpit Gupta (Director of Predictive Analytics and Data...

Fine-Tuning Large Language Models with Hugging Face and DeepSpeed

March 20, 2023 by Sean Owen in Engineering Blog
Large language models (LLMs) are currently in the spotlight following the sensational release of ChatGPT. Many are wondering how to take advantage of...

Unsupervised Outlier Detection on Databricks

Kakapo (KAH-kə-poh) implements a standard set of APIs for outlier detection at scale on Databricks. It provides an integration of the...

Accelerate your model development with the new MLflow Experiments UI

MLflow is the premier platform for model development and experimentation. Thousands of data scientists use MLflow Experiment Tracking every day to find the...

Getting started with NLP using Hugging Face transformers pipelines

February 6, 2023 by Paul Ogilvie and Maddie Dawson in Engineering Blog
Advances in Natural Language Processing (NLP) have unlocked unprecedented opportunities for businesses to get value out of their text data. Natural Language Processing...

What’s New With SQL User-Defined Functions

Since their initial release, SQL user-defined functions have become hugely popular among both Databricks Runtime and Databricks SQL customers. This simple yet...

Building Geospatial Data Products

January 6, 2023 by Milos Colic in Engineering Blog
Geospatial data has been driving innovation for centuries, through the use of maps and cartography and, more recently, through digital content. For example, the oldest...

Accelerating SIEM Migrations With the SPL to PySpark Transpiler

December 16, 2022 by Serge Smertin and Jason Trost in Engineering Blog
In this blog post, we introduce transpiler, a Databricks Labs open-source project that automates the translation of Splunk Search Processing Language (SPL)...

Spatial Analytics at Any Scale With H3 and Photon

H3's global grid indexing system is driving new patterns for spatial analytics across a variety of geospatial use-cases. Recently, Databricks added built-in support...