
Analytics & Compliance with Salesforce-Hadoop Integration

Published On: 2 May 2024 · By Auriga IT
Enterprise Data Engineering  ·  Banking  ·  India

Yes Bank —
Salesforce to Hadoop
Data Migration

How Auriga IT architected a multi-layer, enterprise-grade ETL pipeline to migrate 13 Crore+ transactions, 1 Crore+ documents, and 10+ TB of data from Salesforce CRM into a Hadoop-based data lake — with zero data loss and full regulatory compliance.

Apache NiFi
Salesforce
Oracle DB
Hadoop / HDFS
Azure VMs
Newgen DMS
Power BI

Published: 2 May 2024  ·  Updated: 25 March 2026  ·  By Auriga IT

13 Cr+
Total Transactions
4 Lakh+
Daily Transactions
30,000+
Documents per Day
1 Cr+
Documents Migrated
10+ TB
Data Processed
100%
Checksum Verified
01 — About the Client

Yes Bank

Yes Bank Limited
Leading Private Sector Bank — Mumbai, India — Est. 2004

Yes Bank is one of India's fastest-growing private sector banks, serving retail, corporate, and MSME clients across the full spectrum of banking and financial services. With millions of customer records, loan applications, and regulated financial documents, Yes Bank requires enterprise-grade data infrastructure to support deep analytics, operational efficiency, and regulatory compliance at scale.

02 — Problem Statement

Why Yes Bank Needed to Move Off Salesforce

Yes Bank's entire CRM, loan data, and document archive lived within Salesforce — a system not designed for data warehousing, large-scale analytics, or long-term regulated archival. As data volumes grew into the crores of records, these limitations became both operational and financial liabilities.

01
Unsustainable Storage Costs
Salesforce licensing and storage pricing for documents and images at banking scale was no longer financially viable for long-term archival.
02
Limited Analytics Capability
Salesforce is not a data warehouse. Complex cross-object queries and BI workloads were slow, costly, and lacked the analytical depth teams needed.
03
Document Retrieval Bottlenecks
Retrieving large volumes of loan-related documents and images from Salesforce caused daily operational delays for dependent teams.
04
Regulatory Compliance
RBI regulations required auditable, long-term document storage with full metadata traceability — mandates Salesforce alone could not satisfy.
05
No Centralised Data Lake
BI and analytics teams had no single queryable source of truth. Data was siloed across Salesforce objects, making enterprise reporting impossible.
06
Real-Time and Historical Sync
Updated Salesforce records needed seamless merging with large historical datasets in Hadoop — without duplication, gaps, or full re-processing.
03 — Solution Architecture

A Multi-Layer ETL Pipeline with Apache NiFi

Auriga IT designed and built an enterprise-grade, multi-layered data pipeline with Apache NiFi as the central orchestration engine. The architecture bridges cloud (Azure) and on-premise systems via ExpressRoute — secured through DataPower API gateway — handling structured CRM data and unstructured documents through separate, purpose-optimised flows.

Architecture Diagram
Salesforce (Source) → Apache NiFi on Azure → Oracle DB (Staging) → Hadoop HDFS (Data Lake)  |  Documents → Newgen DMS via DataPower
Pipeline — Step by Step
1
Extraction via Salesforce Bulk API v2
Apache NiFi (hosted on Azure VMs) extracts structured CRM data from Salesforce using Bulk API v2, partitioned by object type and date range. A watermark-based incremental strategy ensures only updated records are fetched per run — guaranteeing exactly-once delivery without re-processing historical data. The "last modified date" approach keeps the Hadoop environment continuously current with Salesforce.
2
Oracle DB — On-Premise Staging Layer
Extracted structured data is loaded into Oracle DB on-premise, serving as a trusted, independently queryable staging layer. This decouples the Salesforce extraction phase from the Hadoop ingestion phase — the Hadoop team can ingest from Oracle independently, with full retry capability that does not re-touch Salesforce. Oracle TDE and Audit Vault ensure data security and auditability at the staging layer.
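The retry safety the staging layer provides comes down to idempotent, key-based loading: re-running a failed batch cannot create duplicates. A minimal sketch, using SQLite as a stand-in for Oracle and an assumed stg_account schema:

```python
import sqlite3

def stage_batch(conn, rows):
    """Upsert a batch into staging, keyed by Salesforce record Id."""
    conn.execute("""CREATE TABLE IF NOT EXISTS stg_account (
        sfdc_id TEXT PRIMARY KEY, payload TEXT, last_modified TEXT)""")
    conn.executemany(
        "INSERT OR REPLACE INTO stg_account (sfdc_id, payload, last_modified) VALUES (?, ?, ?)",
        rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
batch = [("001A", '{"name": "Acme"}', "2024-05-01T00:00:00Z")]
stage_batch(conn, batch)
stage_batch(conn, batch)  # retry of the same batch: still one row
count = conn.execute("SELECT COUNT(*) FROM stg_account").fetchone()[0]
```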
3
Custom Java NiFi Processor — Document and Image Upload
A purpose-built Java processor (AbstractProcessor, Maven-packaged) handles extraction of Salesforce ContentDocument and ContentVersion objects and uploads them to Newgen DMS via DataPower. Key capabilities: binary extraction via REST API, metadata mapping to DMS taxonomy, batched uploads with configurable concurrency, chunked transfer encoding for binaries over 100MB, exponential backoff retry (initial 1s, max 60s, max 5 retries) for HTTP 429 and 5xx errors, dead-letter routing for permanent failures, and sanitization for special characters (&, <, >, quotes, non-ASCII) before all DMS API calls.
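Two of the processor's behaviours can be illustrated in a short sketch: the backoff schedule (1s initial, 60s cap, 5 retries) and the pre-upload sanitization. The exact escaping and non-ASCII rules shown here are assumptions, not the production Java code:

```python
import html
import unicodedata

def backoff_delays(initial=1.0, cap=60.0, retries=5):
    """Exponential backoff delays in seconds: 1, 2, 4, 8, 16 (capped at 60)."""
    return [min(initial * 2 ** i, cap) for i in range(retries)]

def sanitize_for_dms(value):
    """Escape &, <, >, quotes and strip non-ASCII before a DMS API call."""
    escaped = html.escape(value, quote=True)
    # NFKD-decompose accented characters, then drop what ASCII cannot carry.
    return unicodedata.normalize("NFKD", escaped).encode("ascii", "ignore").decode()
```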
4
Hadoop HDFS — Data Lake Ingestion
Data is ingested from Oracle DB into Hadoop HDFS on-premise. Output file sizes are controlled relative to the Hadoop block size for optimum distributed storage utilisation, and JSON-to-CSV transformation is applied for Hadoop compatibility. NiFi clusters scaled horizontally during peak extraction windows, using separate process groups per object type — each with independent scheduling and back-pressure configuration.
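The block-size-aware file sizing and the JSON-to-CSV step can be sketched as follows, assuming a 128 MB HDFS block and hypothetical field names:

```python
import csv
import io

HDFS_BLOCK_BYTES = 128 * 1024 * 1024  # assumed default block size

def json_to_csv(records, fieldnames):
    """Flatten a list of JSON-like dicts into one CSV string."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def rows_per_file(avg_row_bytes, block_bytes=HDFS_BLOCK_BYTES):
    """Rows per output file so one file fills roughly one HDFS block."""
    return max(1, block_bytes // avg_row_bytes)

out = json_to_csv([{"Id": "1", "Amount": "500"}], ["Id", "Amount"])
```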
5
Power BI Analytics Layer
Power BI connects directly to Hadoop, enabling the analytics team to build dashboards across loan processing metrics, First-Time Right (FTR) rates, customer behaviour patterns, loan application rejection ratios, and more — leveraging the full processing power of the distributed data lake for actionable business insights.
6
Reconciliation and Full Audit Trail
Reconciliation tables track every record end-to-end: source-to-target counts, MD5/SHA-256 checksum verification, timestamp audit logs, failed record tracking, and automated end-of-day reports. Every record and document is fully traceable — satisfying both internal data governance requirements and external RBI regulatory compliance mandates.
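A minimal sketch of checksum-based reconciliation, assuming records keyed by Salesforce Id and a canonical JSON serialisation (the production reconciliation tables and digest inputs may differ):

```python
import hashlib
import json

def record_digest(record):
    """SHA-256 over a canonical serialisation, so both sides hash identically."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def reconcile(source_records, target_records):
    """Compare counts and per-record digests between source and target."""
    src = {r["Id"]: record_digest(r) for r in source_records}
    tgt = {r["Id"]: record_digest(r) for r in target_records}
    return {
        "source_count": len(src),
        "target_count": len(tgt),
        "missing": sorted(set(src) - set(tgt)),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }

report = reconcile(
    [{"Id": "1", "Amount": 500}, {"Id": "2", "Amount": 900}],
    [{"Id": "1", "Amount": 500}, {"Id": "2", "Amount": 901}],
)
```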
04 — Technology Stack

What Powered the Pipeline

Category | Technology | Role in the Pipeline
Source | Salesforce CRM | Source of structured CRM records, loan documents, and images
Orchestration | Apache NiFi | Core ETL engine — extraction, transformation, routing; hosted on Azure VMs
Custom Dev | Java (NiFi API) | Purpose-built AbstractProcessor for Newgen DMS document upload via DataPower
Database | Oracle DB | On-premise staging layer between NiFi and Hadoop — decouples pipeline stages
Data Lake | Hadoop / HDFS | On-premise distributed analytics storage for large-scale querying
DMS | Newgen DMS | On-premise document management for long-term regulated document archival
API Gateway | DataPower | Secures all NiFi-to-on-premise service communication
Cloud | Azure + ExpressRoute | NiFi hosting and private, low-latency connectivity to on-premise infrastructure
Analytics | Power BI | Connected to Hadoop for business intelligence and loan analytics reporting
Security | SOC 2 / ISO 27001 | Oracle TDE, Audit Vault, NiFi TLS, data privacy compliance throughout
05 — Challenges and Solutions

Technical Challenges and How They Were Solved

Challenge | Solution
Crores of records per Salesforce object | Bulk API v2 with NiFi parallel extraction partitioned by object type and date range. Sustained 30,000+ docs/day.
Timestamp and watermark management | Watermark-based incremental extraction. Timestamps updated only after Oracle commit — guaranteeing exactly-once delivery.
Multiple concurrent NiFi flows | Separate NiFi process groups per object type and document flow, each with independent scheduling and back-pressure configuration.
Special characters causing DMS failures | Dedicated sanitization layer in the custom processor for &, <, >, double quotes, and non-ASCII before every DMS API call.
Salesforce API rate limits | Adaptive throttling with Bulk API for large sets and REST API for document queries — dynamically balanced per load.
Large binaries over 100MB | NiFi streaming content repository combined with chunked transfer encoding for all oversized DMS uploads.
Data duplication and reconciliation | Reconciliation tables with MD5/SHA-256 checksums, source-to-target counts, and automated end-of-day audit reports.
Legacy SOAP-only integrations | Custom SOAP processors built to handle loan data migration flows where legacy systems used SOAP instead of REST.
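Several of these mitigations reduce to simple streaming patterns. For example, the chunked handling of oversized binaries can be sketched as a generator that never loads the full file into memory (the 8 MB default chunk size is an assumption):

```python
import io

def iter_chunks(stream, chunk_bytes=8 * 1024 * 1024):
    """Yield fixed-size chunks from a binary stream without loading it fully."""
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break
        yield chunk

# Demo on an in-memory stream: 100 bytes in 32-byte chunks.
chunks = list(iter_chunks(io.BytesIO(b"x" * 100), chunk_bytes=32))
```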
06 — Scalability and Performance

Case Studies

1

High-Volume CRM Record Migration

Bulk migration of over 1 Crore Salesforce CRM records and more than 10 TB of structured data into Oracle DB staging, followed by full ingestion into Hadoop HDFS. Apache NiFi clusters were scaled horizontally during peak extraction windows, with Bulk API v2 parallelism partitioned by object type and date range. Watermark-based tracking ensured exactly-once delivery throughout. 100% record match was validated post-migration via checksum verification and reconciliation reports — with zero data loss recorded across all Salesforce objects.

1 Crore+ Records
10+ TB Data
Bulk API v2
100% Verified
Zero Data Loss
2

Document and Image Migration to Newgen DMS

Over 1 Crore documents and images — PDFs, JPEGs, PNGs, and Word files — migrated to Newgen DMS at a sustained rate of 30,000+ documents per day. The custom Java NiFi processor handled binary extraction from Salesforce ContentVersion, metadata schema mapping to DMS taxonomy, batched uploads with configurable concurrency, and chunked transfer encoding for large binaries over 100MB. Exponential backoff (1s to 60s, max 5 retries) managed HTTP 429 and 5xx transient failures, while a dead-letter queue captured permanent failures for review. A full, uninterrupted audit trail was maintained for every document migrated.

1 Crore+ Docs
30K docs/day
100MB+ Binaries
Exponential Backoff
Full Audit Trail
07 — Results and Business Impact

Key Outcomes

The migration fundamentally transformed Yes Bank's data infrastructure — unlocking analytics capabilities that were previously impossible, significantly reducing storage costs, and placing the bank in full regulatory compliance across every migrated record and document.

Metric | Outcome
Total transactions migrated | 13 Crore (130M) across all Salesforce objects
Daily transaction volume | 4 Lakh+ per day, fully automated
Document migration throughput | 30,000+ documents/day — unattended
Total documents migrated | 1 Crore+ with full metadata preserved
Total data volume | 10+ TB end-to-end
Data integrity | Zero data loss — 100% checksum-verified
Regulatory compliance | Full audit trail — RBI mandate satisfied
Pipeline reliability | Fully automated, monitoring + retry, zero manual intervention
Loan Analytics Unlocked
FTR rates, processing times, and rejection ratios now accessible via Power BI on Hadoop
Storage Cost Reduction
Salesforce document and image storage fully offloaded to on-premise Newgen DMS and HDFS
Daily Automated Sync
Salesforce-to-Hadoop runs daily, automatically — no manual triggers or interventions required
Decoupled Architecture
Oracle DB staging allows Hadoop to ingest independently with no live Salesforce dependency
Regulatory Compliance
Every record and document traceable — satisfying RBI archival and auditability requirements
Enterprise BI Layer
Power BI delivers actionable insights on customer behaviour, loans, and operations at scale
08 — Frequently Asked Questions

Questions About This Project

How did Auriga IT migrate data from Salesforce to Hadoop for Yes Bank?
Auriga IT used Apache NiFi as the core ETL engine hosted on Azure VMs. Structured CRM data was extracted from Salesforce via Bulk API v2, loaded into Oracle DB on-premise as a staging layer, and then ingested into Hadoop HDFS. Documents and images were handled separately by a custom Java NiFi processor that extracted Salesforce ContentVersion objects and uploaded them to Newgen DMS via a DataPower API gateway.
How much data was migrated from Salesforce to Hadoop for Yes Bank?
The migration covered 13 Crore+ (130 million) total transactions, over 10 TB of structured data, and 1 Crore+ (10 million) documents and images. The live pipeline sustains 4 Lakh+ daily transactions and 30,000+ documents per day — all checksum-verified with zero data loss.
What is Apache NiFi and why was it chosen for this migration?
Apache NiFi is a data flow automation and ETL platform that provides real-time data ingestion, transformation, and routing. It was chosen for its built-in Salesforce API support, scalable parallel processing, back-pressure management, custom processor extensibility via Java, and robust audit logging — essential for a migration of this scale and compliance requirement.
How was data integrity and zero data loss ensured during the Yes Bank migration?
Reconciliation tables tracked every record with source-to-target counts, MD5/SHA-256 checksum verification, timestamp audit logs, and failed record tracking. Watermark-based incremental extraction guaranteed exactly-once delivery. Automated end-of-day reports validated daily completeness — resulting in 100% verified data across all 13 Crore transactions.
Why did Yes Bank migrate from Salesforce to Hadoop?
The primary drivers were unsustainable Salesforce storage costs at banking scale, limited analytics and complex query capability, slow document retrieval impacting operations, RBI regulatory requirements for long-term auditable document storage, and the need for a centralised data lake to power enterprise analytics via Power BI.
How long did the Yes Bank Salesforce to Hadoop migration take?
The case study was first published in May 2024. The live pipeline continues to operate as of March 2026, processing 4 Lakh+ transactions and 30,000+ documents daily on a fully automated basis with no manual intervention required.

Planning a Similar Data Migration?

Auriga IT has deep expertise in enterprise data engineering, Apache NiFi pipelines, and large-scale Salesforce migrations. Let us help with your next data challenge.

Talk to Our Team →

Case Study — Yes Bank Data Migration  ·  © Auriga IT 2024–2026
