Tailoring credit products to the diverse needs of customers
HDFC Bank moves campaign operations to Databricks to optimize underwriting and risk assessment
Reduction in query time for faster decision-making
Data pipelines for downstream risk management use cases

HDFC Bank, India’s largest private-sector bank, has migrated to the Databricks Data Intelligence Platform on Azure to address limitations in their on-premises data management environment, including data silos, difficulty scaling compute cost-efficiently and maintenance that was costly and resource-intensive. This move has enhanced the bank’s data analytics capabilities, improved operational processes and accelerated time to market for new services to HDFC Bank’s diverse customer base. By leveraging Databricks ETL, analytics and AI/ML capabilities, the bank can now scale data ingestion cost-effectively. This powers business use cases that include deriving actionable insights to improve fraud control, optimizing marketing campaigns and refining model building and deployment to support new solutions that improve the customer experience. This digital transformation initiative, part of the bank’s “future-ready” strategy, positions HDFC Bank to maintain their competitive edge in India’s banking sector through improved operational efficiency and customer experience.
Legacy platform complexity slows data insights and inflates costs
HDFC Bank has long been at the forefront of technology adoption, utilizing data-driven decision-making to maintain their competitive edge and provide superior customer service and experiences. However, customer growth and increased data management complexity limited how far they could continue to scale some of their technology, especially their campaign operations. The Credit Risk Analytics and Innovation (CRAIN) department found it difficult to integrate external data for credit-related applications, including underwriting and fraud prevention. Their infrastructure struggled to scale quickly and cost-effectively, so the CRAIN campaign team spent significant time managing complex systems rather than focusing on innovations that better serve customers. This situation highlighted the need for HDFC Bank to modernize their data architecture and enhance their credit risk analytics capabilities, empowering them to improve operational efficiency and decision-making processes.
Ashish Abraham, Head of Analytics and Innovation at HDFC Bank, described the business and technical challenges of their 16-node Hadoop cluster: “We’re responsible for the development, execution and maintenance of campaign-specific legacy ETL jobs, but we didn’t have the computing resources, automation and scalability required.”
HDFC Bank needed to modernize their data infrastructure to a unified, user-friendly and scalable platform with advanced tooling. The process required migrating data, optimizing code and implementing DevOps best practices. The new system needed to seamlessly integrate third-party, internal and enterprise data sources into a centralized platform. It also needed to support their robust data processing requirements and speed up ETL pipelines for downstream analytics and machine learning use cases. This upgrade was required to enhance data management, improve operational efficiency and support faster, data-driven decision-making processes within the bank.
Streamlining data processing on the Databricks Data Intelligence Platform
The CRAIN department at HDFC Bank opted to migrate to the Databricks Data Intelligence Platform on Azure. This decision was made to centralize data management, leverage integrated tools to boost productivity and operational efficiency and enhance data accessibility and security. Through a metadata-driven ETL framework, they can now expedite data delivery, potentially improving credit process analytics. The migration involved transferring on-premises data to the cloud via a secure VPN, with batch transfers moving data into Delta Lake, which serves as the department’s primary data repository.
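As an illustration of what a metadata-driven ETL framework can look like, the following is a minimal sketch rather than the bank's actual implementation. Each job is described by a metadata record, and one generic function renders the Spark SQL that loads a batch of files into a Delta table. The `JobSpec` fields, the paths, the table names and the choice of Parquet as the landing format are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class JobSpec:
    """One record of ETL metadata (all field names are hypothetical)."""
    source_path: str          # landing location of a batch transfer (assumed Parquet)
    target_table: str         # Delta table to create or replace
    columns: list[str]        # columns to carry over
    partition_by: list[str] = field(default_factory=list)


def render_ctas(job: JobSpec) -> str:
    """Render a Spark SQL statement that loads Parquet files into a Delta table."""
    select_list = ", ".join(job.columns)
    partition_clause = (
        f" PARTITIONED BY ({', '.join(job.partition_by)})" if job.partition_by else ""
    )
    return (
        f"CREATE OR REPLACE TABLE {job.target_table} USING DELTA"
        f"{partition_clause} AS "
        f"SELECT {select_list} FROM parquet.`{job.source_path}`"
    )


# Adding a new feed means adding a metadata record, not writing a new job.
JOBS = [
    JobSpec(
        source_path="/landing/campaign_leads",   # hypothetical path
        target_table="crain.campaign_leads",     # hypothetical table
        columns=["customer_id", "product_code", "load_month"],
        partition_by=["load_month"],
    ),
]

statements = [render_ctas(job) for job in JOBS]
# In a Databricks notebook, each statement would be run with spark.sql(statement).
```

Keeping the job definitions as data is what lets a single code path serve many pipelines, which is one plausible way such a framework could expedite data delivery.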
This has allowed the CRAIN department to uncover and distribute reliable insights and analytics that fuel use cases to improve credit processes and enhance the customer experience. On the new platform, migrated jobs run on Databricks, transform data in Databricks Notebooks using Spark SQL and store the results in Delta Lake. Through their migration, they have experienced enhanced analytics capabilities, improved data governance and scalability and faster time to insight.
Ashish explained, “In many cases, there was a requirement to optimize the translated code, perform Apache Spark™ parameter fine-tuning and effectively partition the data, which led to a huge reduction in the overall query execution time and thereby reduced costs. We also parameterized some business variables whose values were hard-coded in the old solution’s ETL jobs.” Now, the bank’s data can be used by other processing applications, ML models, data analytics engines or direct consumption using Power BI reports.
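The two techniques mentioned in the quote, Spark parameter tuning and parameterized business variables, can be sketched roughly as follows. This is an assumed example, not the bank's code: the table, the column names and the threshold values are hypothetical, and the configuration values are placeholders rather than tuned settings. The `args` argument of `spark.sql` with named parameter markers assumes PySpark 3.4 or later.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CampaignParams:
    """Business variables that a legacy job might have hard-coded (hypothetical)."""
    min_bureau_score: int = 700
    load_month: str = "2024-01"


# Real Spark SQL configuration keys; the values are placeholders, not tuned settings.
SPARK_CONF = {
    "spark.sql.shuffle.partitions": "200",
    "spark.sql.adaptive.enabled": "true",
}

# Named parameter markers (:name) are bound at run time instead of being
# pasted into the SQL text. Table and column names are hypothetical.
ELIGIBILITY_QUERY = """
SELECT customer_id, product_code
FROM crain.campaign_leads
WHERE load_month = :load_month
  AND bureau_score >= :min_bureau_score
"""


def query_args(params: CampaignParams) -> dict:
    """Map the business variables onto the query's parameter markers."""
    return {
        "load_month": params.load_month,
        "min_bureau_score": params.min_bureau_score,
    }


def run(spark, params: CampaignParams):
    """Apply the configuration and run the query on an existing SparkSession."""
    for key, value in SPARK_CONF.items():
        spark.conf.set(key, value)
    return spark.sql(ELIGIBILITY_QUERY, args=query_args(params))
```

If the table is partitioned by `load_month`, filtering on that column lets Spark skip the partitions it does not need, which is the usual reason careful partitioning shortens query time.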
The CRAIN department has implemented strategies to enhance productivity and data governance using Databricks. The team created reusable notebooks with widgets for parameterization and faster queries, and established separate DEV, UAT and PROD environments with DevOps tools for efficient release management. To improve data governance, Unity Catalog offers centralized security management, granular access controls and data lineage tracking. These capabilities have resulted in increased productivity, improved collaboration, enhanced governance and streamlined development processes, allowing for more efficient data management while maintaining security standards.
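A reusable, widget-driven notebook with per-environment targets might look something like the sketch below. It is an assumption-laden illustration: the catalog names, schema, table and group name are hypothetical, and `LocalWidgets` is a small stand-in written for this example so the code can run outside Databricks. Inside a notebook, `dbutils.widgets` would be passed in its place.

```python
# Hypothetical mapping from environment to Unity Catalog catalog name.
ENV_CATALOGS = {"DEV": "crain_dev", "UAT": "crain_uat", "PROD": "crain_prod"}


class LocalWidgets:
    """Minimal stand-in for dbutils.widgets so the sketch runs outside Databricks."""

    def __init__(self, overrides=None):
        self._values = dict(overrides or {})

    def text(self, name, defaultValue, label=None):
        # Keep a value that was already supplied; otherwise use the default.
        self._values.setdefault(name, defaultValue)

    def get(self, name):
        return self._values[name]


def read_params(widgets):
    """Declare the notebook's widgets, then read their current values."""
    widgets.text("env", "DEV", "Environment")
    widgets.text("load_month", "2024-01", "Load month")
    env = widgets.get("env")
    if env not in ENV_CATALOGS:
        raise ValueError(f"unknown environment: {env}")
    return {"catalog": ENV_CATALOGS[env], "load_month": widgets.get("load_month")}


def grant_select(catalog, schema, table, principal):
    """Render a Unity Catalog GRANT statement for read-only access."""
    return f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO `{principal}`"


# In a Databricks notebook this would be read_params(dbutils.widgets).
params = read_params(LocalWidgets({"env": "UAT"}))
grant = grant_select(params["catalog"], "campaigns", "leads", "credit-analysts")
```

With this shape, the same notebook can be promoted from DEV to UAT to PROD by changing a widget value rather than editing code, and access rules stay as explicit, reviewable statements.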
Platform efficiency accelerates cost-effective use case fulfillment
HDFC Bank’s CRAIN department’s migration to the Databricks Data Intelligence Platform has enhanced their data analytics capabilities for credit-related use cases. The platform has simplified the integration of various data sources and streamlined data processing, enabling the department to uncover new actionable insights. As a result of the migration to Databricks, they have experienced increased productivity and an elevated customer experience, all while reducing resource constraints and operational costs.
Ashish presented the business impacts CRAIN has realized as a direct result of migrating to the Databricks Platform. He reported, “Our data pipeline is significantly faster, and query time has reduced considerably. We’re processing massive amounts of data monthly, and we have tens of millions of unique customers to learn more from and apply those findings to continue improving business processes.”
HDFC Bank’s CRAIN department is advancing their data maturity using the Databricks Data Intelligence Platform to generate insights for credit policies, analysis and campaign rollout. Following the platform’s success in CRAIN, HDFC Bank plans to extend its use to additional data teams across the bank. This expansion aims to drive more use cases and foster innovation throughout the organization. By leveraging data and AI capabilities with Databricks, HDFC Bank is working to enhance their operations, improve decision-making processes and maintain competitiveness in the financial services sector.