
Delivering next-level gaming experiences that keep players coming back


Personalized gaming experiences delivered daily


From iconic games like Sonic the Hedgehog to recent favorites like Warhammer, SEGA Europe has delighted gamers worldwide for decades. With 30 million customers, SEGA collects more than 25,000 valuable data events every second on player behavior and in-game interactions. During the COVID-19 pandemic, SEGA saw a 2x spike in active players but struggled to gain actionable insights with their legacy infrastructure. They lacked the computing capacity to handle the massive amounts of incoming data in an environment that would enable their data science teams to deliver actionable analytics. Now on the Databricks Data Intelligence Platform, SEGA is able to democratize all their data for analytics and machine learning (ML) to deliver a personalized gaming experience through relevant player communications and updates, as well as new products that help reduce churn and increase revenue.

Struggling to harness massive amounts of gaming data

SEGA has long been an innovative powerhouse in the gaming industry. As one of the pioneering game development houses, SEGA has churned out multi-million-selling game franchises, including Sonic the Hedgehog and Football Manager. With over 30 million customers, SEGA is still focused on creating amazing gaming experiences, but now with data as the primary driver. With more than 25,000 events collected every second, they are now able to harness their data to create an interactive gaming community in which the player experience is personalized across all touchpoints, from interactions with customer service teams to new in-game features that drive engagement.

Unfortunately, their dispersed environment made it difficult for data teams to operationalize the unstructured and streaming data efficiently and effectively for analytics and ML. Stanley Wang, a Data Scientist at SEGA, remembers, “At the time, it was difficult for me to process our data because it was stored on different platforms and architectures. We had things stored on S3, Redshift, and others were placed on Microsoft Azure.” He goes on to say, “One of the biggest headaches was managing the ingestion of all these data sources in one place so that we could use them for ML projects.” Using Jupyter notebooks on their local machines, the data science team spent inordinate amounts of time accessing and importing data from the various data sources.

And with limited compute on a laptop, they weren’t able to scale exploration and model training, greatly limiting their ability to deliver insights to the product teams for game innovations and the studio teams to help with commercial and marketing decision-making. Felix Baker, a Data Services Manager at SEGA Europe, explains, “We had bottlenecks when three studios tried to make analytical queries on the same Redshift table at the same time, while a fourth had already launched a job lasting 10 hours, blocking its use.” SEGA Europe needed a unified approach to manage the amount of data they ingest, as well as an open environment that enables collaboration among teams.

Moving to the lakehouse democratizes data, AI and BI

After testing other data and AI platforms, SEGA chose the Databricks Data Intelligence Platform on AWS to serve as its foundational data platform for data engineering, analytics and data science. Felix explains, “We tried cloud-based data warehouses, but when tested, they didn’t have sufficient ingest capabilities for our streaming needs.” With Delta Lake, they can easily scale compute to handle large volumes of structured and unstructured data, including financial data, anonymized customer information, and in-game behavior and analytics data. They can also deliver more efficient and streamlined data pipelines to feed BI reports and ML models, all focused on enhancing the gaming experience.

“The lakehouse architecture suits our needs perfectly,” says Francis Hart, Director of Online Technology at SEGA. “We can store data in one location that provides all data teams with access to what they need in near real-time.”

Collaboration across data teams, underpinned by interactive workspaces and support for multiple programming languages, has also improved data productivity and efficiencies. “We now have a single data team within the business,” explains Francis. “We’ve solved problems across the business together rather than working within individual silos.”

SEGA now uses Databricks SQL to track key metrics and, via BI reports, to cohort users by playstyle. By including real-time data to drive community activities, the team can evaluate new features, identify opportunities to better engage their community, prevent pirated usage, improve player engagement and more. They have also developed their own ML algorithm to tailor games and updates to players based on their interactions. For example, if new players struggle to establish themselves in a game within a certain period of time, SEGA examines and updates the UI for increased ease of use.
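To make the new-player example concrete, the sketch below shows one way such a check could look in plain Python. It is an illustration only, not SEGA's implementation: the record fields, the idea of a single "milestone" that marks a player as established, and the seven-day window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class PlayerProgress:
    """Hypothetical per-player record; field names are illustrative."""
    player_id: str
    first_seen: datetime                       # first session in the game
    milestone_reached_at: Optional[datetime]   # None if never reached


def struggling_new_players(
    players: List[PlayerProgress],
    now: datetime,
    window: timedelta = timedelta(days=7),     # assumed onboarding window
) -> List[str]:
    """Return IDs of players who did not reach the milestone within the window."""
    flagged = []
    for p in players:
        deadline = p.first_seen + window
        if now < deadline:
            # Still inside the onboarding window: too early to judge.
            continue
        reached = p.milestone_reached_at
        if reached is None or reached > deadline:
            flagged.append(p.player_id)
    return flagged


players = [
    PlayerProgress("a", datetime(2024, 1, 1), datetime(2024, 1, 3)),   # on time
    PlayerProgress("b", datetime(2024, 1, 1), None),                   # never reached
    PlayerProgress("c", datetime(2024, 1, 1), datetime(2024, 1, 20)),  # reached late
    PlayerProgress("d", datetime(2024, 1, 28), None),                  # window still open
]
flagged = struggling_new_players(players, now=datetime(2024, 2, 1))
print(flagged)
```

A list like this would only be a starting point; deciding which part of the UI to revisit would still depend on the BI reports and the team's own analysis.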

Stanley adds, “With the help of Databricks, we have been able to completely transform the data science role here and be a key pillar of decision-making for the business.”

More gaming insights, more engaged gamers

Since using Databricks, SEGA has unlocked gaming insights to improve the player experience and boost monetization opportunities through targeted product innovations designed to drive engagement and revenue. Data science and studio teams are working faster and more efficiently because data sets are available in one centralized environment that enables simple model execution. Streamlined data ingestion on the scalable Databricks Data Intelligence Platform provides SEGA with more usable data than they’ve ever had before. Felix says, “With the previous architecture, we managed to collect data every half-hour, at best. With Databricks we can collect them every minute.”

With the Databricks Data Intelligence Platform as the foundation for their data analytics and ML infrastructure, SEGA is poised to productionize more use cases, including sentiment analysis on social media to gauge prerelease excitement and post-update reviews; player behavior analysis to uncover “unsuspected” styles of play; real-time game statistics during streamer broadcasts to further customize communication; and distributor data analysis for sales and financial forecasts.

Building a loyal and prosperous community takes a collective effort. With insights from the data their passionate customers generate now at their fingertips, SEGA is positioned to deliver a range of customer-centric experiences designed to improve brand loyalty, reduce overall churn and increase revenue today and into the future. “Having better and faster insights into our data allows us to deliver a better community gaming experience that increases customer satisfaction for not only our games but also the entire SEGA experience,” concludes Francis.