KuKu FM

CUSTOMER STORY

Personalizing audio and video storytelling at scale

10x growth in users

24x faster time to insights to power customer experiences

25% reduction in operational costs


KuKu is revolutionizing how millions of people consume content. As one of India’s fastest-growing streaming platforms, KuKu produces and distributes audio content and mobile-first microdramas in multiple languages and genres. Facing rapid user growth, KuKu needed a data platform that could scale cost-effectively while delivering faster insights to optimize customer experiences. By migrating to the Databricks Data Intelligence Platform on Google Cloud, KuKu transformed into a data-first organization. The company now powers 100+ dashboards, processes streaming events in near real time, trains sophisticated machine learning (ML) recommendation models and delivers critical business insights in under 2 hours, down from 1–2 days, all while reducing analytics costs per user by two-thirds.

Constrained by infrastructure that can’t keep pace

KuKu is a fast-growing digital content platform whose mission is to deliver personalized, engaging experiences to every user through smarter, data-driven insights. But rapid growth exposed the limits of its data infrastructure, and costs and complexity climbed along with the user base. “Getting critical insights about our business could take one to two days,” Vishal Sarkar, Head of Analytics at KuKu, said. “If we noticed a high failure rate in a payment gateway or needed to optimize a marketing campaign, we simply couldn’t respond fast enough.”

The company’s analytics needs were complex and diverse. Over 100 dashboards served stakeholders across product, business and content teams, tracking everything from new user acquisition and monetization funnels to content engagement by genre and demographics. Average listening time, retention rates and subscription renewal metrics all required near real-time visibility for informed decisions. Additionally, KuKu was building in-house ML models to power personalized content recommendations, demanding robust data pipelines and flexible compute resources.

As the business grew, the team faced additional challenges around governance and access controls. “We needed fine-grained control over who could access what data, especially sensitive customer information,” Vishal said. With clickstream data flowing through Firebase and multiple pipelines serving various stakeholders, the risk of accidental data corruption or unauthorized access grew alongside the business. At the same time, the team needed ways to control infrastructure costs and gain better visibility as data volumes continued to scale.

Supporting user growth with data intelligence

KuKu’s migration to the Databricks Data Intelligence Platform on Google Cloud marked a fundamental shift in how the company approaches data and analytics. The decision to run Databricks on Google Cloud was strategic. Clickstream data already flowed through Firebase and BigQuery, and data was already stored in Google Cloud Storage (GCS). Staying within the Google ecosystem enabled seamless integration and minimized data transfer costs and latency. “The data was already present on Google Cloud, so while choosing to migrate to Databricks, it was an easier option,” Vikas Goyal, CTO and co-founder at KuKu, said. Together, Databricks and Google Cloud provide KuKu with the speed and flexibility to process massive volumes of streaming user data in real time.

Today, over 100 dashboards powered by Databricks serve stakeholders across product, business and content teams, tracking everything from user acquisition and monetization to content engagement and retention. To support this scale, the company implemented a medallion architecture using Delta Lake to process streaming user event data through Bronze, Silver and Gold layers of data. Raw, immutable data is ingested at the Bronze layer, then cleaned, transformed, validated and de-duplicated in the Silver layer, before being curated into analytics-ready formats in the Gold layer. This multilayered approach ensures data quality and integrity while supporting real-time event processing and batch workloads. Data is stored as cost-effective Parquet files in GCS, with Delta Lake providing the transactional layer for reliability and performance.
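The Bronze, Silver and Gold flow described above can be illustrated with a minimal sketch. This is plain Python rather than Delta Lake or Spark code, and it is not KuKu's pipeline: the event fields (`event_id`, `user_id`, `genre`, `listen_seconds`) and the function names are hypothetical, chosen only to show raw data being kept untouched, then validated and de-duplicated, then aggregated into an analytics-ready metric.

```python
from collections import defaultdict
from typing import Any

Event = dict[str, Any]


def to_silver(bronze: list[Event]) -> list[Event]:
    """Validate, clean and de-duplicate raw events without mutating Bronze."""
    seen: set[str] = set()
    silver: list[Event] = []
    for event in bronze:
        # Validation: drop events that lack required identifiers.
        if not event.get("event_id") or not event.get("user_id"):
            continue
        seconds = event.get("listen_seconds")
        if not isinstance(seconds, (int, float)) or seconds < 0:
            continue
        # De-duplication: keep only the first occurrence of each event_id.
        if event["event_id"] in seen:
            continue
        seen.add(event["event_id"])
        # Cleaning: normalize the genre label.
        genre = str(event.get("genre", "unknown")).strip().lower()
        silver.append({**event, "genre": genre})
    return silver


def to_gold(silver: list[Event]) -> dict[str, float]:
    """Curate an analytics-ready metric: total listening seconds per genre."""
    totals: dict[str, float] = defaultdict(float)
    for event in silver:
        totals[event["genre"]] += event["listen_seconds"]
    return dict(totals)


# Hypothetical raw events standing in for the Bronze layer.
bronze_events = [
    {"event_id": "e1", "user_id": "u1", "genre": " Drama ", "listen_seconds": 120},
    {"event_id": "e1", "user_id": "u1", "genre": " Drama ", "listen_seconds": 120},  # duplicate
    {"event_id": "e2", "user_id": "u2", "genre": "comedy", "listen_seconds": 300},
    {"event_id": "e3", "user_id": None, "genre": "drama", "listen_seconds": 45},  # invalid
    {"event_id": "e4", "user_id": "u3", "genre": "DRAMA", "listen_seconds": 60},
]

silver_events = to_silver(bronze_events)
gold_metrics = to_gold(silver_events)
print(gold_metrics)
```

Keeping the Bronze records unchanged means the Silver and Gold layers can be rebuilt from scratch if a cleaning rule changes, which is the main reason for separating the layers.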

Democratizing insights to improve user experiences

To serve the 100+ dashboards built on Apache Superset, KuKu leverages Databricks SQL with serverless compute on Google Cloud. The near-instant startup times and automatic scaling provide the speed and concurrency needed for business teams to run complex queries during peak usage hours. “Databricks SQL serverless offers a powerful combination of performance, scalability and cost efficiency,” Vishal explained. “Together with Google Cloud, it allows us to deliver insights in near real time while keeping costs under control.”

Unity Catalog transformed how KuKu manages data governance and access controls. It provides clear visibility into how data is managed across KuKu, with fine-grained permissions at the table and view levels. This centralized governance ensures sensitive customer data remains protected while allowing appropriate access for analytics and machine learning workloads. “Unity Catalog gives us control over data management and access levels,” Vishal stated. “We can prevent accidental data corruption while ensuring teams have the data they need.”

Streamlining workflows to drive personalization

KuKu uses Lakeflow Jobs for workflow automation and orchestration of complex data processing tasks. This allows KuKu to coordinate multiple jobs within larger workflows, managing dependencies and scheduling for extract, transform, load (ETL) pipelines and model training. MLflow handles the complete machine learning lifecycle, providing experiment tracking, model versioning and deployment capabilities. The systematic approach ensures reproducibility and allows the data science team to iterate quickly on recommendation models that drive user engagement.
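Dependency management in a multi-task workflow comes down to running each task only after everything it depends on has finished. The sketch below shows that idea with Python's standard-library `graphlib`; it does not use the Lakeflow Jobs API, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "ingest_bronze": set(),
    "build_silver": {"ingest_bronze"},
    "build_gold": {"build_silver"},
    "refresh_dashboards": {"build_gold"},
    "train_recommender": {"build_silver"},
}


def run_order(tasks: dict[str, set[str]]) -> list[str]:
    """Return one valid execution order.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(tasks).static_order())


order = run_order(workflow)
print(order)
```

Tasks with no dependency between them, such as the dashboard refresh and model training here, may appear in either order, and an orchestrator is free to run them in parallel.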

Databricks Assistant has been one of the most impactful tools. “The in-platform AI assistant helps with data engineering tasks, like generating and optimizing SQL queries for table joins,” Vishal said. “We can put Spark logs into Databricks Assistant and get insights on what can be optimized, reducing shuffling between tables or adjusting how Delta tables are stored.” With Google Cloud’s infrastructure and Databricks’ intelligence, KuKu can rapidly turn insights into better experiences for every listener.

From days to hours: Real-time BI drives user growth

The migration to Databricks on Google Cloud has been transformative for KuKu. The company reduced the time to critical insights from 1–2 days to less than 2 hours, a game-changing improvement for a fast-moving digital platform. This acceleration impacts multiple use cases, from monitoring payment gateways to optimizing marketing campaigns. “If there’s a high failure rate in a payment gateway, we can identify it and redirect traffic immediately,” Vishal said. “Before Databricks, this would take a day or two. Now, we get updates every 30 minutes to an hour.”

Performance gains have enabled KuKu to convert ad hoc analysis into automated pipelines running every 2 hours. Marketing teams can monitor campaigns in near real time, quickly addressing issues like conversion tracking failures that affect customer acquisition costs. Teams gain immediate visibility into user funnel fluctuations, enabling rapid iteration on features and personalized experiences for listeners.

Rapid growth without proportional cost increases

KuKu has scaled rapidly while keeping costs in check. Since 2022, monthly active users have grown 10x, yet analytics costs increased less than 2x, reducing cost per user by two-thirds. “We can manage large volumes of data while getting faster insights,” Vishal explained. Operational costs dropped 20%–25%, storage costs fell 31% through techniques like table vacuuming, and query and job runtimes improved 30%–40% using Databricks Assistant to optimize Spark execution plans.
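The link between user growth, total cost growth and cost per user is simple arithmetic, sketched below. The 10x user growth comes from the text; the exact cost multiple is not stated, only that it stayed under 2x, so 2.0 is used as an assumed upper bound. The function name is hypothetical.

```python
def cost_per_user_reduction(user_growth: float, cost_growth: float) -> float:
    """Fractional drop in cost per user when users and total cost both scale."""
    if user_growth <= 0:
        raise ValueError("user_growth must be positive")
    return 1.0 - cost_growth / user_growth


# 10x users (from the text) with total cost at an assumed 2x upper bound.
reduction_at_2x = cost_per_user_reduction(10.0, 2.0)
print(f"{reduction_at_2x:.0%}")
```

Under these assumptions, any cost multiple below 2x gives a per-user reduction above the two-thirds figure quoted, so that figure reads as a conservative floor.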

A data-first foundation for innovation

The transformation extends beyond immediate metrics. By unifying data, analytics and machine learning on a single platform, KuKu has become a data-first organization with the ability to innovate rapidly. The company is piloting Databricks Genie to democratize data access, allowing teams to query metrics through natural language in Slack. “With Databricks, we gained more control and maturity in our data platforms,” Vishal said. “The unified tool makes it easy to operationalize our data to better engage customers.”

The partnership between Databricks and Google Cloud has been key to this success. Together, Databricks’ intelligence and Google Cloud’s infrastructure resources empower KuKu to scale efficiently, balance performance and cost, and respond in real time to market opportunities — all while improving experiences for every listener.