by Kacey Hertan
The World Bank Group's mission is to improve shared prosperity across the planet. Achieving that mission depends on turning vast amounts of data into actionable insight. With tens of millions of documents in its knowledge repositories and three million publication downloads every month, the challenge is making that knowledge findable and usable at scale, to empower teams to drive greater global impact.
To do that, the World Bank Group built a unified data and AI platform on Databricks, bringing together structured operational data and unstructured document repositories for the first time, enabling more informed decisions with far less manual research.
The World Bank Group operated both structured and unstructured data streams that had never been integrated. On the structured side, legacy on-premises databases made it difficult to keep pace with evolving reporting requirements. On the unstructured side, researchers and analysts had to manually search through enormous document libraries to answer basic questions.
"How do I go and look for a project that was executed in India in 1960? What are the pitfalls of that? What went well?" says Suresh Kaudi, a data and AI leader at World Bank Group. "We had no clue. Librarians, researchers would go in and pull tons and tons of documents, try to read through them, try to make sense out of it."
That knowledge bottleneck slowed decision-making and limited the organization's ability to surface lessons learned across its global portfolio.
World Bank Group began its Databricks journey with a focused goal: modernizing its data platform and migrating structured content off legacy systems. As that effort matured, the team identified Databricks as the platform capable of solving the knowledge bottleneck as well.
Unity Catalog was a turning point for the team. "Unity Catalog was a game changer for us. It was a single unified interface where we could govern our data," says Kaudi. From there, Databricks Volumes gave the team a scalable path for managing unstructured document content alongside structured data in the same platform. Genie enabled business users to ask natural language questions against structured data without writing SQL or relying on technical teams. The Databricks AI Gateway provided centralized control over agent access, cost management and security as the system grew more complex.
With the critical technology in place, the World Bank Group was ready to implement a solution that would bring its vision of data democratization to life. The implementation evolved in stages, each building on the last. The team started by migrating operational data into Databricks and using Unity Catalog to establish governance across structured content. This laid the foundation for the organization's corporate scorecard, a public-facing accountability tool.
"It's more outcomes-driven than output-driven," says Kaudi. "Instead of saying how many miles of road we put in, it started measuring how many jobs we gave, how much connectivity was established." When early Genie deployments returned inconsistent results for structured queries, the team implemented a metrics layer to ensure deterministic answers, which are critical for financial and operational reporting.
"In the structured content, you need an answer. What is my bank balance? I don't want to see a different number every time," explains Kaudi. The team then turned to unstructured content. Using Databricks Volumes and vector search, they indexed project documents to create a retrieval-augmented generation (RAG) capability that could respond to natural language queries and spare users the manual search.
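To make the retrieval step concrete, here is a minimal sketch of the idea behind retrieval-augmented generation. It is not the World Bank Group's implementation and does not use the Databricks vector search API: a toy bag-of-words similarity stands in for a real embedding model, and the document IDs, document texts and function names are all made up for illustration.

```python
import math
import re
from collections import Counter

# Made-up placeholder documents; a real system would index actual project files.
DOCS = {
    "doc-a": "Irrigation project in India completed in 1960 with lessons on canal maintenance",
    "doc-b": "Road connectivity program in Kenya and jobs created",
    "doc-c": "Annual financial statement and commitments summary",
}

def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return up to k document IDs ranked by similarity, dropping zero-score matches."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(text)), doc_id) for doc_id, text in documents.items()),
                    reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]

def build_prompt(query, documents, k=2):
    """Assemble the retrieved passages and the question into one prompt for a language model."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in retrieve(query, documents, k))
    return f"Answer using only the context below.\n{context}\nQuestion: {query}"

print(build_prompt("irrigation project India 1960", DOCS))
```

The point of the pattern is that the language model only sees passages the retriever selected, so the answer can be traced back to specific documents rather than to a librarian's manual search.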
This then created a new problem. Each Genie instance is built against a specific metrics layer, meaning a separate Genie is needed for each data domain. A question that spans two domains, for example "what is my commitment in India and what are my actions," would require querying two separate Genies.
The solution was an agentic layer on top. The World Bank Group built a single interface backed by an intent classifier, a domain classifier and a query decomposer. When a question comes in, the intent classifier identifies what's being asked, the domain classifier determines which agent or agents need to be called, and the query decomposer breaks complex multi-part questions into components and routes each to the right place. Results are assembled and returned as a single response.
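The routing flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's code: simple keyword rules stand in for the classifiers, the domain names, keywords and stub agents are hypothetical, and running the domain classifier on each decomposed part is one possible ordering of the three components.

```python
import re

# Hypothetical keyword rules standing in for a trained domain classifier.
DOMAIN_KEYWORDS = {
    "commitments": {"commitment", "commitments"},
    "actions": {"action", "actions"},
    "documents": {"lesson", "lessons", "pitfalls", "report"},
}

# Stub agents; in the article these would be domain-specific Genie agents or the RAG agent.
AGENTS = {
    "commitments": lambda q: f"[commitments agent] {q}",
    "actions": lambda q: f"[actions agent] {q}",
    "documents": lambda q: f"[documents agent] {q}",
}

def classify_intent(question):
    """Decide what kind of request this is (assumed labels: 'visualize' or 'lookup')."""
    q = question.lower()
    return "visualize" if any(w in q for w in ("chart", "plot", "graph")) else "lookup"

def decompose(question):
    """Break a multi-part question into components by splitting on 'and'."""
    parts = re.split(r"\s+and\s+", question.strip().rstrip("?"))
    return [p.strip() for p in parts if p.strip()]

def classify_domains(sub_question):
    """Return every domain whose keywords appear in the sub-question."""
    words = set(re.findall(r"[a-z]+", sub_question.lower()))
    return [domain for domain, keywords in DOMAIN_KEYWORDS.items() if words & keywords]

def answer(question):
    """Route each component to its agent(s) and assemble a single response."""
    results = []
    for part in decompose(question):
        # Assumption: unmatched components fall back to the document agent.
        for domain in classify_domains(part) or ["documents"]:
            results.append(AGENTS[domain](part))
    return {"intent": classify_intent(question), "results": results}

print(answer("what is my commitment in India and what are my actions"))
```

With the example question from the article, the two halves land on two different stub agents, yet the caller receives one combined response.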
It’s not unlike traditional multi-tier web design, with front end, application layer, business logic and database, updated for an AI context. The user sees one interface, but behind it, any number of domain-specific Genie agents can be running, alongside the RAG agent for document retrieval and a visualization agent that controls how results are displayed. If a query returns data as a bar chart and the user wants a pie chart instead, the visualization agent handles that without re-running the underlying query.
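One way a visualization agent could switch chart types without re-running a query is to cache the query result and only rebuild the chart specification. The sketch below assumes that design; the class name, method names and sample rows are hypothetical and are not taken from the World Bank Group's system.

```python
class VisualizationAgent:
    """Keeps query results in memory so a chart can be re-rendered without re-querying."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable that executes the underlying query
        self._cache = {}             # query text -> list of (label, value) rows

    def render(self, query, chart_type):
        # Only hit the data layer the first time a given query is seen.
        if query not in self._cache:
            self._cache[query] = self._run_query(query)
        rows = self._cache[query]
        # Return a plain chart specification; a real front end would draw it.
        return {
            "type": chart_type,
            "labels": [label for label, _ in rows],
            "values": [value for _, value in rows],
        }


# Usage with a fake data layer that records how often it is called.
calls = []

def fake_run_query(query):
    calls.append(query)
    return [("region A", 1), ("region B", 2)]  # made-up placeholder rows

agent = VisualizationAgent(fake_run_query)
bar = agent.render("commitments by region", "bar")
pie = agent.render("commitments by region", "pie")
print(bar["type"], pie["type"], len(calls))
```

Separating the chart specification from the data keeps presentation changes cheap, which mirrors the front end versus data layer split in the multi-tier comparison.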
Before expanding the system broadly, the team ran structured feedback sessions with external stakeholders including NGOs, civil servants and government representatives across Africa and East Asia Pacific regions. They used AI/BI to capture query inputs, routing decisions and outputs, then analyzed results to understand what questions users were actually asking and where gaps existed.
"We had to collect the feedback externally as well," says Kaudi. "How is the World Bank Group helping them? What kinds of questions do they ask? So that we can be more proactive."
The platform now supports three million document downloads per month through an AI-powered search and synthesis layer, with half of that traffic coming from low- and middle-income countries. The user feedback prototype spanning multiple global regions was built and deployed in approximately two and a half days.
"Imagine doing this with a project," says Kaudi. "Two years back I would have imagined doing it in a two-year span. But this was done quickly, on the fly, to get the real value out of it."
The corporate scorecard was delivered on the Databricks platform. Analysts can now retrieve valuable data and context in a single query, eliminating the need for manual document search. The World Bank Group is working to bring all of this together in its flagship Knowledge 360 and Data 360 projects. The goal is to connect World Bank Group, IFC, IDA and MIGA so that knowledge is accessible to any stakeholder regardless of which institution generated it.
The long-term stakes go beyond operational efficiency.
See how Databricks helps organizations unify data, govern AI and turn knowledge into action at global scale.