Data + AI Summit 2023
SAN FRANCISCO, JUNE 26-29
VIRTUAL, JUNE 28-29

Data Caching Strategies for Data Analytics and AI

Wednesday, June 28 @3:30 PM

Overview

The growing popularity of data analytics and artificial intelligence (AI) has dramatically increased the volume of data used in these fields, creating a need for greater computational capability. Caching plays a crucial role in accelerating data and AI computations, but these domains have different data access patterns and therefore require different cache strategies. In this session, we will share our observations on data access patterns in analytical SQL and AI training, based on practical experience with large-scale systems. We will discuss evaluation results for various caching strategies in both domains and provide caching recommendations for different use cases. We will also share best practices learned over the years from large internet companies, covering the following topics:




  1. Traffic pattern for analytical SQL and cache strategy recommendation

  2. Traffic pattern for AI training and how to measure cache efficiency for different AI training processes

  3. Cache capacity planning based on real-time metrics of the working set

  4. Adaptive caching admission and eviction for uncertain traffic patterns
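
To make the contrast between the two access patterns concrete, here is a minimal sketch. It is not the speakers' method: it assumes an LRU policy, and the traces and function names (`lru_hit_ratio`, `working_set_size`, `sql_trace`, `training_trace`) are hypothetical. It replays a skewed, SQL-like trace and an epoch-style sequential scan through a small LRU cache, and estimates the working set as the number of distinct keys in a recent window.

```python
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay `trace` through an LRU cache of `capacity` items; return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used item
    return hits / len(trace) if trace else 0.0


def working_set_size(trace, window):
    """Count distinct keys among the last `window` accesses."""
    return len(set(trace[-window:]))


# Hypothetical traces, invented for illustration only.
sql_trace = ["a", "b", "a", "c", "a", "b", "a", "d", "a", "b"]  # skewed: "a" is hot
training_trace = ["s0", "s1", "s2", "s3"] * 3  # three epochs over four samples

for name, trace in [("sql", sql_trace), ("training", training_trace)]:
    for capacity in (2, 3, 4):
        print(name, capacity, round(lru_hit_ratio(trace, capacity), 2))
```

In this toy setup, the repeated scan gets no LRU hits until the cache holds all four samples, whereas the skewed trace gets some hits even from a two-item cache. A windowed working-set count like `working_set_size` is one simple signal that capacity planning could use, under the assumptions stated above.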


Type

  • Breakout

Experience

  • In Person

Track

  • Research

Industry

  • Enterprise Technology, Media and Entertainment

Difficulty

  • Intermediate

Duration

  • 40 min

Session Speakers

Beinan Wang

Senior Staff Software Engineer

Alluxio

Chunxu Tang

Research Scientist

Alluxio
