Seamless and secure access to data has become one of the biggest challenges facing organizations. Nowhere is this more evident than in technology-led external audits, where analyzing 100% of transactional data is fast becoming the gold standard. These audits involve reviewing tens of billions of lines of financial and operational billing data.
To deliver meaningful insights at scale, analysis must not only be robust but also efficient — balancing cost, time, and quality to achieve the best outcomes in tight timeframes.
Recently, in collaboration with a major UK energy supplier, KPMG leveraged Delta Sharing in Databricks to overcome performance bottlenecks, improve efficiency, and enhance audit quality. This blog discusses our experience with Delta Sharing, the key benefits it delivered, and the measurable impact on our audit process.
To meet public financial reporting deadlines, we needed to access and analyze tens of billions of lines of the audited entity's billing data within a short audit window.
Historically, we relied on the audited entity's analytics environment hosted in AWS PostgreSQL. As data volumes grew, the setup showed its limits:
Given these constraints, we needed a scalable, high-performance solution that would allow efficient access to and processing of data without compromising security or governance, reducing 'machine time' and delivering quicker outcomes.
Delta Sharing, an open data-sharing protocol, provided the ideal solution by enabling secure and efficient cross-platform data exchange between KPMG and the audited entity without duplication.
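From the recipient's side, the open protocol keeps this exchange simple: a credential (profile) file from the provider plus a table locator is enough to read shared data, with no copies landing in between. Below is a minimal sketch using the open-source `delta-sharing` Python connector; the share, schema, and table names are illustrative placeholders, not the actual names used on this engagement.

```python
# Sketch of reading a Delta Shared table with the open-source
# `delta-sharing` connector (pip install delta-sharing).
# All share/schema/table names here are hypothetical examples.

def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the <profile>#<share>.<schema>.<table> locator the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"

url = table_url("config.share", "billing_share", "audit", "meter_readings")

# With a valid credential file from the data provider, the table can then be
# loaded without any intermediate extract:
#   import delta_sharing
#   df = delta_sharing.load_as_pandas(url)   # small tables
#   df = delta_sharing.load_as_spark(url)    # large tables, read in Spark
print(url)  # config.share#billing_share.audit.meter_readings
```

Because the connector reads directly from the provider's storage, the recipient always sees the current state of the shared tables rather than a point-in-time extract.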
Compared to extending PostgreSQL, Databricks offered several distinct advantages:
We introduced Delta Sharing in a way that did not disrupt ongoing audit work:
We used Delta Sharing to access and analyze billions of meter readings across millions of the audited entity's customer accounts. We observed significant improvements across multiple KPIs:
Using Delta Sharing has made a noticeable difference to our audit process. We can securely access data across cloud platforms, without delays or manual data movement, so our teams always work from the latest, single source of truth. This cross-cloud capability means faster audits, more reliable results for the audited clients we work with, and tight control over data access at every step. — Anna Barrell, Audit Partner, KPMG UK
A couple of technical considerations when working with Databricks:
• Delta Sharing: As early adopters, we found that some features weren't yet available (for example, sharing materialized views). These have since been refined in the GA release, and we'll be enhancing our Delta Sharing solutions with this functionality.
• Lakeflow Jobs: Currently, there is no mechanism for a recipient to confirm whether an upstream job feeding a Delta Shared table has completed. One script ran before an upstream load had finished and produced an incomplete output, though this was quickly identified through our completeness and accuracy procedures.
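Pending a native signal, one pragmatic mitigation is a recipient-side guard: before running downstream analytics, query the shared table for a watermark and a row count and compare them against expectations for the period. The sketch below shows the guard logic only; the function name, thresholds, and column semantics are our own illustrative assumptions, not a Databricks API.

```python
# Hypothetical recipient-side readiness check for a Delta Shared table.
# In practice, max_event_ts and observed_row_count would come from a query
# against the shared table, e.g. SELECT max(reading_ts), count(*) FROM ...
from datetime import datetime, timezone

def upstream_looks_complete(max_event_ts: datetime,
                            expected_through: datetime,
                            observed_row_count: int,
                            min_row_count: int) -> bool:
    """Treat the shared table as ready only if its watermark covers the
    expected reporting period AND the row count clears a sanity floor."""
    return max_event_ts >= expected_through and observed_row_count >= min_row_count

period_end = datetime(2024, 3, 31, tzinfo=timezone.utc)  # illustrative period
ready = upstream_looks_complete(
    max_event_ts=datetime(2024, 3, 31, 23, 0, tzinfo=timezone.utc),
    expected_through=period_end,
    observed_row_count=998_000,
    min_row_count=1_000_000,
)
print(ready)  # False: the row count is below the expected floor
```

A check like this would have flagged the incomplete load before the script consumed it, rather than relying on downstream completeness and accuracy procedures to catch it.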
Delta Sharing has proven to be a game-changer for audit data analytics, enabling efficient, scalable, and secure collaboration. Our successful implementation with the energy supplier demonstrates the value of Delta Sharing for clients with diverse data sources across cloud and platform.
We recognize that many organizations store a significant portion of their financial data in SAP. This presents an additional opportunity to apply the same principles of efficiency and quality at an even greater scale.
Through Databricks’ strategic partnership with SAP, announced in February of this year, we can now access SAP data via Delta Sharing. This joint solution, which has become one of SAP's fastest-selling products in a decade, allows us to tap into this data while preserving its business context and semantics. By doing so, we can ensure the data remains fully governed under Unity Catalog and its total cost of ownership is optimized. As the entities we audit progress on their transformation journeys, we at KPMG are looking to build on this momentum, anticipating the additional benefits it will bring to a streamlined audit process.