Building a Cybersecurity Lakehouse for CrowdStrike Falcon Events Part II

Visibility is critical when it comes to cyber defense – you can't defend what you can't see. In the context of a modern...

Scanning for Arbitrary Code in Databricks Workspace With Improved Search and Audit Logs

How can we tell whether our users are using a compromised library? How do we know whether our users are using that API...

Disaster Recovery Automation and Tooling for a Databricks Workspace

This post is a continuation of the Disaster Recovery Overview, Strategies, and Assessment blog. A broad ecosystem of tooling exists to implement...

Mosaic ResNet Deep Dive

July 18, 2022 by Matthew Leavitt
TL;DR: We recently released a set of recipes which can accelerate training of a ResNet-50 on ImageNet by up to 7x over standard...

Hunting for IOCs Without Knowing Table Names or Field Labels

There is a breach! You are an infosec incident responder and you get called in to investigate. You show up and start asking...

Using Spark Structured Streaming to Scale Your Analytics

This is a guest post from the M Science Data Science & Engineering Team. Modern data doesn't stop growing "Engineers are taught by...

6 Guiding Principles to Build an Effective Data Lakehouse

In this blog post, we will discuss some guiding principles to help you build a highly-effective and efficient data lakehouse that delivers on...

Databricks Ventures Invests in Tecton: An Enterprise Feature Platform for the Lakehouse

July 12, 2022 by Andrew Ferguson
Operational machine learning, which involves applying machine learning to customer-facing applications or business operations, requires solving complex data problems. Data teams need to...

Using Airbyte for Unified Data Integration Into Databricks

July 11, 2022 by Fei Lang and Simon Späti
This is a collaborative post from Databricks and Airbyte. We thank Simon Späti, Data Engineer & Technical Author at Airbyte, for their contributions...

Introducing Spark Connect - The Power of Apache Spark, Everywhere

At last week's Data and AI Summit, we highlighted a new project called Spark Connect in the opening keynote. This blog post walks...