CPFB | Central Provident Fund Board, Singapore

CUSTOMER STORY

Modernizing public services through citizen-first innovation

300+

Users trained on Azure Databricks


The Central Provident Fund (CPF) is a cornerstone of Singapore’s national social security system, supporting more than 4 million members with home ownership, healthcare financing and lifelong income. The CPF Board (CPFB) also administers key government-managed programs, including the Home Protection Fund, Lifelong Income Fund, MediShield Life, CareShield Life and ElderShield. With such a diverse member base, CPFB must ensure their services are inclusive, effective and relevant. As data demands grew and legacy systems began to limit performance, the organization needed a more scalable, secure way to deliver insights across policy formulation, operations and service delivery — a goal they achieved with Azure Databricks.

Growing demands on data management

The CPFB began their digital journey more than 60 years ago. They were the first government agency in Singapore to install a mainframe computer and automate the manual ledger system that tracked CPF member accounts. From then on, CPFB continuously leveraged data to better serve their members across the country. But as member needs grew more complex, the volume of funds under management increased and new offerings and services were introduced, CPFB’s data systems needed to modernize and evolve. “While our data journey spanned many decades, it intensified about 10 years ago. We put in place an on-premises data warehouse to consolidate data from various sources. Over time, we found that this had limitations as our colleagues needed to perform more demanding analytics on large datasets,” Vance Ng, Director of the Data Science Accelerator (DSA) Department at CPFB, recalled.

As data demands increased, the limitations of the on-premises system became more apparent. Teams were working with increasingly large and fragmented datasets that were difficult to manage efficiently, which slowed analytics workflows and decision-making. Specifically, users had to download data from the warehouse to their local machines, and gaining access meant navigating multiple layers of approval. Once they had the data, they often ran their analysis on desktops or laptops with limited computing power, which constrained both performance and scalability. “Any time we needed to do extensive analysis, not only did we have to deal with managing large datasets stored in individual devices, which presented a security concern, but often we had to request additional laptops beyond the devices we had on hand,” Benedict Ho, Senior Deputy Director in the DSA Department at CPFB, shared.

These challenges slowed analytics, raised concerns about data security and access, and limited CPFB’s ability to scale their data initiatives and operationalize machine learning (ML). To support their mission of building a modern data infrastructure that delivers greater value to members, CPFB sought a scalable platform capable of powering secure, end-to-end analytics without compromising performance or compliance.

Better leveraging data to serve CPF members

To modernize their data infrastructure, CPFB built a Unified Data Platform (UDP) to make data more accessible, secure and usable across the organization. The goal was to improve how teams worked with data to increase user competency, enable more efficient data use and support better policy development and service delivery for CPF members. CPFB recognized early that change management would be critical to success. Internal users were involved from the start, and a train-the-trainer model was put in place, which allowed early adopters to pass on knowledge and build confidence across various CPFB departments. More than 300 users were trained on Azure Databricks, and training efforts continued beyond implementation, with experienced data analysts partnering with business users on more advanced data science projects. “This shift allowed us to leverage data more efficiently, propose better policies and serve CPF members more effectively,” Benedict explained.

Azure Databricks was selected to give teams access to advanced analytics capabilities in a familiar environment. Collaboration also improved significantly with the adoption of Databricks Notebooks. Teams no longer had to send scripts back and forth over email or manage version conflicts. Instead, they could co-edit notebooks, troubleshoot in real time and work from a single source of truth. Users could build dashboards in Power BI without saving or sending flat files. These changes helped teams move faster and focus more on analysis, rather than on administrative overhead. Databricks Assistant also gave users an AI-powered coding companion to accelerate development and problem-solving across technical and nontechnical teams.

Data entering UDP from CPFB’s various source systems goes through an extract, transform, load (ETL) process. A data readiness dashboard in Azure Databricks informs users when data is ready for consumption. Previously, data readiness was communicated by email after quality checks were performed.
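To make the idea of a data readiness check concrete, here is a minimal sketch in plain Python of the kind of logic such a dashboard might sit on top of. It is not CPFB’s implementation: the dataset names, the fields on each load record, the status labels and the specific quality checks (freshness, non-empty, no missing keys) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetLoad:
    """Summary of one ETL load. All fields are hypothetical."""
    name: str             # illustrative dataset name
    load_date: date       # date the ETL job last completed
    row_count: int        # rows written by the load
    null_key_count: int   # rows missing a required key


def readiness_status(load: DatasetLoad, today: date, min_rows: int = 1) -> str:
    """Return 'READY' only if the load is from today and passes basic checks."""
    if load.load_date != today:
        return "STALE"            # the ETL has not refreshed this dataset today
    if load.row_count < min_rows:
        return "EMPTY"            # the load ran but produced too few rows
    if load.null_key_count > 0:
        return "FAILED_QUALITY"   # required keys are missing
    return "READY"


def readiness_table(loads: list[DatasetLoad], today: date) -> dict[str, str]:
    """Map each dataset name to its status, as a dashboard might display it."""
    return {load.name: readiness_status(load, today) for load in loads}


if __name__ == "__main__":
    today = date(2024, 1, 2)  # fixed date so the example is reproducible
    loads = [
        DatasetLoad("dataset_a", date(2024, 1, 2), 1000, 0),
        DatasetLoad("dataset_b", date(2024, 1, 1), 1000, 0),
        DatasetLoad("dataset_c", date(2024, 1, 2), 500, 3),
    ]
    for name, status in readiness_table(loads, today).items():
        print(f"{name}: {status}")
```

Publishing a status per dataset, rather than emailing a notice, lets users check readiness on their own whenever they need to.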

Governance and security remained top priorities throughout the transformation. Azure’s native security features, including firewalls, helped maintain the same security posture the organization had with their on-premises data warehouse. UDP was deployed on Singapore’s Government Commercial Cloud 2.0 (GCC 2.0), providing a secure foundation for the modern platform. CPFB also used Microsoft Purview to establish governance policies and a shared data dictionary to guide secure data usage across departments. “We were one of the first government agencies to onboard with Azure GCC 2.0,” Benedict added. “Now, we’re working closely with Microsoft and Databricks to future-proof UDP, enable Unity Catalog and explore the latest generative AI features in Azure Databricks.”

Expanding possibilities with new capabilities

These new tools also helped users work more independently and efficiently. Databricks Assistant became a valuable asset for both analysts and business users by offering real-time coding support that enhanced development speed and problem-solving. “This incredibly powerful assistant made it efficient not just for us to help users but also for them to solve problems themselves,” Jared noted. Report generation became faster and more automated, with simple reports generated in minutes and complex ones scheduled to run overnight. Finally, machine learning models were developed, trained, fine-tuned and deployed securely on a single platform.