CUSTOMER STORY

Powering sustainable finance with data intelligence

ESGpedia leverages Databricks to empower companies in their ESG journey, achieving cost savings and enhanced data insights.

4x

cost savings in data pipeline management

SOLUTION: Data-Driven ESG
PLATFORM USE CASE: Delta Lake, Mosaic AI
CLOUD: AWS

ESGpedia is a leading environmental, social and governance (ESG) data and technology platform, supporting companies across the Asia-Pacific region in their ESG journey toward net zero goals and ESG compliance. ESGpedia partnered with Databricks to create a unified lakehouse architecture, with the aim of resolving challenges posed by the fragmented ESG data landscape. This move slashed costs and enabled the use of large language models (LLMs) with retrieval augmented generation (RAG) techniques. Leveraging the Databricks Data Intelligence Platform, ESGpedia now swiftly processes complex sustainability data, delivering crucial insights to corporations and financial institutions throughout APAC.

Navigating the complexities of ESG data management

Before working with Databricks, ESGpedia struggled to manage a vast array of data sources, each requiring extensive precleaning, processing and relationship mapping. With about 300 pipelines running and that number growing, bringing new datasets on board was becoming ever more complex and time-consuming.

Jin Ser, Director of Engineering at ESGpedia, explained the situation: “Fragmented data across multiple different platforms hampered our efficiency and ability to provide timely, personalized insights. Our internal teams struggled to quickly access the necessary information, leading to slower response times and a diminished ability to assist our clients effectively.”

The complexity of managing and coordinating multiple models across various systems proved to be a significant obstacle. This data fragmentation not only affected operational efficiency but also hindered the development of AI-driven initiatives to stay at the forefront of innovation.

Unifying data and embracing AI with Databricks

ESGpedia leveraged several components of the Databricks Data Intelligence Platform. Central to this implementation was a lakehouse architecture that combined data storage and data management in one place, making access and analysis easier.

“Databricks served as the foundational step for us,” Ser said. “It’s not just about having data; it’s about making it actionable and efficient, which Databricks enabled us to do effectively at scale.”

The Databricks Platform unlocked streaming data capabilities, enabling continuous data ingestion from various sources. Unity Catalog played a critical role in data management and governance, supporting compliance requirements with stringent access controls and detailed data lineage. This unified approach to governance accelerated data and AI initiatives, simplified regulatory compliance and allowed data teams to access data and collaborate securely across ESGpedia’s distributed teams in Singapore, the Philippines, Indonesia, Vietnam and Taiwan.

ESGpedia also embraced Databricks Mosaic AI, paving the way for their use of RAG, a generative AI workflow that uses custom data and documents to provide context for LLMs. The company used the Mosaic AI Agent Framework to develop a RAG solution specifically tailored to improve the efficiency and effectiveness of their internal teams.
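As a rough illustration of the RAG workflow described above, the sketch below retrieves the most relevant documents for a question and places them in the prompt as context. It is not ESGpedia's implementation and does not use the Mosaic AI Agent Framework: the keyword-overlap scoring, the function names and the sample documents are all assumptions made up for illustration, and a production system would typically use embeddings and a vector index instead.

```python
import re
from collections import Counter

# Hypothetical placeholder documents, invented for this sketch.
DOCUMENTS = [
    "Supplier A reports scope 1 emissions of 120 tonnes CO2e for 2023.",
    "Supplier B holds a workplace safety certification.",
    "Supplier A sources 40 percent of its electricity from renewables.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, document):
    """Count the tokens shared by the query and the document."""
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum(min(query_counts[t], doc_counts[t]) for t in query_counts)


def retrieve(query, documents, k=2):
    """Return the k documents with the highest overlap score."""
    ranked = sorted(documents, key=lambda d: overlap_score(query, d), reverse=True)
    return ranked[:k]


def build_rag_prompt(query, documents, k=2):
    """Assemble a prompt that gives the LLM the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_rag_prompt("What are Supplier A emissions?", DOCUMENTS))
```

The resulting prompt string would then be sent to whichever LLM endpoint is in use; that call is omitted here because it depends on the deployment.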

“We are looking to provide highly customized and tailored sustainability data and analytics for our customers based on their industry, country and sector,” Ser explained. “A RAG framework can be running on Databricks, and that’s actually what we use. In terms of prompt engineering, we currently are using few-shot prompting to help with the classification of our datasets.”
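Few-shot prompting, as mentioned in the quote above, means showing the model a handful of labeled examples before the item to classify. A minimal sketch of what such a prompt might look like follows. The category labels, example descriptions and function name are assumptions for illustration only; the source does not say which categories or examples ESGpedia uses.

```python
# Hypothetical labeled examples, invented for this sketch.
FEW_SHOT_EXAMPLES = [
    ("Annual greenhouse gas emissions by facility", "environmental"),
    ("Employee turnover and workplace injury rates", "social"),
    ("Board composition and anti-corruption policies", "governance"),
]


def build_few_shot_prompt(description, examples=FEW_SHOT_EXAMPLES):
    """Build a classification prompt that shows labeled examples first."""
    lines = [
        "Classify each dataset description as environmental, social or governance.",
        "",
    ]
    for text, label in examples:
        lines.append(f"Description: {text}")
        lines.append(f"Category: {label}")
        lines.append("")
    # The unlabeled item goes last so the model completes its category.
    lines.append(f"Description: {description}")
    lines.append("Category:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_few_shot_prompt("Water consumption per production site"))
```

Leaving the final "Category:" line open invites the model to complete it with a single label, which keeps the output easy to parse.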

Transforming ESG insights and achieving operational excellence

The implementation of Databricks has had a profound impact on ESGpedia’s operations and on the quality of the insights its platform delivers. The company achieved a 4x cost savings in data pipeline management, significantly reducing operational expenses while improving efficiency. The migration to Databricks, which involved transitioning about 300 pipelines, was completed in just six months. This rapid transformation speaks to both the dedication of ESGpedia’s team and the ease of adoption of Databricks’ tools.

“Our time to insight has greatly improved, thanks to our ability to integrate complex data sources using Databricks Mosaic AI,” Ser noted. The unified environment allowed ESGpedia to break down the silos between different types of data and systems, making it possible to harness data for their desired artificial intelligence use cases. The RAG solution built with the Mosaic AI Agent Framework has enhanced ESGpedia’s ability to provide nuanced, context-aware insights to their corporate and bank clients. Rather than relying on opaque scoring systems, ESGpedia offers corporations and financial institutions granular data points about the sustainability efforts of companies and their value chains (e.g., SMEs, suppliers and contractors), enabling more informed decision-making across sustainable finance and green procurement.

Looking forward, ESGpedia continues to explore the frontiers of AI and machine learning to revolutionize their operations. By democratizing access to high-quality insights, ESGpedia’s integrated data and AI architecture enhances productivity and empowers employees to perform their roles with greater efficacy. As ESGpedia continues to innovate and expand across the Asia-Pacific region, Databricks is an integral part of their mission to deliver superior ESG insights and drive sustainable business practices.