Data + AI Summit 2023
SAN FRANCISCO, JUNE 26-29
VIRTUAL, JUNE 28-29

Deep Dive Into Grammarly's Data Platform

Wednesday, June 28 @11:30 AM

Overview

Grammarly helps 30 million people and 50,000 teams communicate more effectively. Using the Databricks Lakehouse Platform, we can rapidly ingest, transform, aggregate, and query complex data sets from an ecosystem of sources, all governed by Unity Catalog. This session will give an overview of Grammarly’s data platform and the decisions that shaped its implementation. We will dive deep into some of the architectural challenges the Grammarly Data Platform team overcame as we developed a self-service framework for incremental event processing.

Our investment in the lakehouse and Unity Catalog has dramatically improved the speed of our data value chain: 5 billion events (ingested, aggregated, de-identified, and governed) are made available to stakeholders (data scientists, business analysts, sales, marketing) and downstream services (feature store, reporting/dashboards, customer support, operations) within 15. As a result, we have improved our query cost performance (110% faster at 10% of the cost) compared to our legacy system on AWS EMR.

I will share architecture diagrams, their implications at scale, code samples, and problems solved and to be solved in a technology-focused discussion about Grammarly’s iterative lakehouse data platform.


Type

  • Breakout

Experience

  • In Person

Track

  • Data Lakehouse Architecture, Databricks Experience (DBX)

Industry

  • Professional Services

Difficulty

  • Intermediate

Duration

  • 40 min

Session Speakers

Christopher Locklin

Engineering Manager, Data Platform

Grammarly

Faraz Yasrobi

Software Engineer

Grammarly
