MIGRATION STORY

E-commerce democratized for millions of users in India

30%

Lower TCO

50%

Faster runtime for long-running queries

Meesho

“With multiple petabytes of data and several terabytes added daily, there was a dire need for a platform that could scale up to support our needs. Migrating to the Databricks Data Intelligence Platform has helped Meesho scale data ingestion and handle 100x data volume growth in a few months while lowering operating costs by 30%. Most importantly, this allows Meesho to democratize e-commerce for everyone in India and emerge as the country’s only true e-commerce marketplace.”

— Katreddi Kiran Kumar, VP Data Platform, Meesho

The thirst for e-commerce in under-penetrated markets — which gained access to the internet only several years ago — is insatiable. Meesho, India’s only true e-commerce marketplace, targets these regions by providing a platform for small businesses to enter and flourish in the online market without significant investment. Serving more than 140 million transacting users across India, Meesho understood the need to help small businesses improve operational efficiencies while driving consumer engagement with customized shopping experiences. With the explosion in data volumes, Meesho migrated to the Databricks Data Intelligence Platform to efficiently meet their data ingestion requirements and deploy AI to help suppliers understand shopper behavior, provide inventory recommendations and better detect fraud. After migrating to Databricks, Meesho was able to easily and cost-effectively scale to meet data demands — helping them to shift their focus to delivering an engaging online shopping experience for customers and suppliers.

Migration to lakehouse architecture helped reduce operational complexity and runaway costs

Meesho’s platform serves millions of users (consumers and suppliers) across India, many of whom are entrepreneurs who don’t have the resources to run a traditional retail business. As Meesho transitioned its business from a marketplace connecting suppliers with consumers to a direct-to-consumer model, data volumes exploded.

“With the need to support multiple petabytes of data and several terabytes added daily, our existing platform had limited flexibility to scale to meet our growing needs,” said Alok Sharma, Director of Engineering at Meesho. “These frequent spikes in data activity were too much for us to effectively fulfill business transactions. As a result, operational costs were adding up and we were struggling to help our suppliers meet the growth in customer shopping demands.”

Previously, Meesho’s fast-growing internal technical and data teams had limited flexibility to handle varying data workloads, and scaling clusters to support unexpected increases in aggregate workload volume was highly resource-intensive. As transaction volumes grew, so did the effort required to maintain the existing platform, dramatically increasing TCO. Additionally, with a data latency of three to four hours, Meesho’s data engineers and analysts struggled to use real-time data to deliver timely insights to suppliers and timely shopping experiences to customers. This latency also limited the reliability and accuracy of the data feeding the thousands of machine learning models designed to improve supplier operations and consumer engagement.

“Our status quo was not sustainable as we continued to adopt real-time analytics and machine learning,” said Sharma. “And with a multicloud strategy on the horizon, we had to find a solution that could help us achieve our goals while reducing TCO.”

Migrating to the lakehouse provides real-time data for a better customer experience

With multiple petabytes of data spread across eight transactional databases and numerous clickstream sources, 2,000+ ML models, and 400+ BI reports used by over 200 data users, Meesho knew the migration would be a significant undertaking. More importantly, they had to ensure a successful migration with little downtime and no impact on the user experience. Before the migration began, the team measured the different-sized workloads they encountered as part of their daily business. They decided the best approach was to sequence the migration: data workloads with low dependencies were migrated first, beginning with data ingestion and followed by consumption workloads. During the migration, the team also optimized the workloads, especially long-running queries, so they could run more efficiently on the other tools Meesho used, such as Starburst and Presto.

With proper planning and support from Databricks Professional Services, the migration went as planned and was completed in six months. With the Databricks Data Intelligence Platform now serving as the foundation for their data infrastructure, Meesho has improved the speed and efficiency of their data ingestion and data consumption workflows, scaling workloads on demand while reducing data latency from three to four hours to less than one hour. “The migration went smoothly with zero downtime. With Databricks Data Intelligence Platform, our data can now help us march to our goal of personalizing the customer shopping experience,” said Sharma.

Databricks Data Intelligence Platform reduces TCO and personalizes the shopping experience

In the initial stages of evaluation, Meesho was concerned about the unit price differences between their existing platform and Databricks. However, further analysis and a proof of concept (POC) confirmed the potential TCO reductions. After migrating to the Databricks Data Intelligence Platform, Meesho saw immediate improvement in operational costs and efficiency. TCO analysis across different workloads confirmed the lakehouse could better scale data ingestion and lower operating costs by 30%. Data processing and analysis performance also improved across the different data workloads. With dynamic cluster management capabilities that scale on demand, expensive long-running queries not only ran 50% faster but were also 30% less expensive due to reduced storage costs. “During Diwali, our biggest shopping season of the year, we saw a record 8.5 million orders in one day, and our ability to support that significant scale was painless,” said Sharma.

From a data analytics perspective, Meesho’s data teams now have seamless access to fresh and reliable data to satisfy customer use cases such as sales forecasting, revenue analytics, marketing spend optimization and churn prevention. With the Databricks Data Intelligence Platform, data practitioners have more readily available and higher-quality data to use for their ML models. Faster data availability to business-critical systems helps drive Meesho’s business outcomes including monetization of product mix, inventory recommendations for suppliers and a more personalized experience for customers. Meesho was also able to provide data analytics to suppliers, improving their ability to source inventory based on what is popular, and helping them to optimize forecasting and pricing for their product mix.

Since migrating to the Databricks Data Intelligence Platform, Meesho has modernized its data infrastructure and saved on operational costs while boosting its capacity for innovation. Access to fast and reliable data will help Meesho become more data-driven and deliver an exceptional online shopping experience to its millions of customers and suppliers into the future.