Delivery potential limited by slow data ingestion and analysis
Gousto’s mission is to change the way people eat through the delivery of boxes of fresh ingredients and easy-to-follow recipes. But the company is much more than a food delivery service. Gousto aims to use data and AI to create a more convenient and personalized experience for its customers. In the words of Gousto’s Chief Technology Officer, Shaun Pearce, “Gousto is a data company that loves food.” However, even before the massive pandemic-driven increase in demand, Gousto was at a crossroads. “We were at capacity. Demand was huge, but we couldn’t scale the fulfillment supply chain,” said Pearce.
At Gousto, recipe boxes take a journey through a fulfillment center on a conveyor belt, past a series of stations where boxes stop for an agent to pack ingredients. An efficient journey means more boxes per hour. Detailed data on box ingredients and stock inventory was being collected, but it couldn’t be analyzed quickly enough to optimize each box’s route or anticipate ingredient availability. The ETL batch process took over two hours, when it worked at all, so Gousto was always looking back at historical data and relying on ad hoc observations to make strategic decisions for the business. This latency was affecting a key performance measure: on time in full (OTIF). “We knew we had to rethink our ETL to deliver production-line data in near real-time,” said Eoin O’Flanagan, who heads up data engineering at Gousto.
In addition, Gousto had disparate systems and data sources managing warehouse stock levels and ingredient replenishment at each pick station, so a unified view of the data was impossible without significant manual effort. “The systems were not designed for querying and analytics, and we needed visibility across the whole line to make efficiency improvements,” said O’Flanagan.
Time spent troubleshooting failed data ingestion batch jobs and maintaining infrastructure was time taken away from developing more sophisticated data analytics. “We needed to reduce infrastructure tasks so we could focus on collaborating to write code, not dealing with pipeline and EMR issues, so that new ideas could become a reality much faster,” said O’Flanagan. Gousto needed to move from daily batch updates using costly EMR to near real-time streaming data.