Application fraud prediction using a machine learning model adds 10% revenue
Advanced machine learning models to identify fraudulent customers at the point of application
Fraud is a billion-dollar business and it is increasing every year. Traditional methods of data analysis have long been used to detect fraud. They require complex and time-consuming investigations that draw on different domains of knowledge, such as finance, economics, business practices and law. Fraud often consists of many instances or incidents involving repeated transgressions using the same method. Fraud instances can be similar in content and appearance but are usually not identical.
The first industries to use data analysis techniques to prevent fraud were the telephone companies, the insurance companies and the banks (Decker 1998). One early example of successful implementation of data analysis techniques in the banking industry is the FICO Falcon fraud assessment system, which is based on a neural network shell.
In general, the primary reason to use data analytics techniques to tackle fraud is that many internal control systems have serious weaknesses. To effectively test, detect, validate, correct errors in, and monitor control systems against fraudulent activities, business entities and organizations rely on specialized data analytics techniques such as data mining, data matching, sounds-like functions, regression analysis, clustering analysis and gap analysis. Techniques used for fraud detection fall into two primary classes: statistical techniques and artificial intelligence.
About the Client
Equifax Inc. is a global information solutions company that uses trusted unique data, innovative analytics, technology and industry expertise to power organizations and individuals around the world by transforming knowledge into insights that help make more informed business and personal decisions. Equifax operates primarily in the business-to-business sector, selling consumer credit and insurance reports and related analytics to businesses in a range of industries. Business customers include retailers, insurance firms, healthcare providers, utilities, government agencies, as well as banks, credit unions, personal and specialty finance companies and other financial institutions.
In 2010, Equifax established a presence in the Indian market and was licensed by the RBI to operate as a Credit Information Company. Equifax India, registered as Equifax Credit Information Services Private Limited (ECIS), is a joint venture between Equifax Inc., USA and seven leading Indian financial institutions.
A fraud score driven by bureau data has low correlation with credit risk scores and is therefore effective in dual-score strategies. It can also be used, to a very limited extent, on new-to-credit applicants based on the address information provided by the applicant, and on enquiry-only records. The continuous learning approach keeps the model abreast of emerging fraud hotspots.
Most importantly, it allows the exploration of “links” between fraud cases; methods based on application and document data are far less effective at capturing such links.
- Less than 20% correlation with the credit risk score currently in production.
- Overall accuracy of more than 65%, with segments such as two-wheeler loans reaching ~78%.
- Estimated increase in revenue for Equifax of ~10%.
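The low-correlation result is what makes a dual-score strategy workable: the fraud score should rank applicants differently from the credit risk score. A minimal sketch of such a check follows, assuming Pearson correlation as the measure (the source does not say which correlation statistic was used). The score values, variable names and the 0.20 threshold applied in code are illustrative assumptions, not project data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for four applicants (made-up values for illustration).
fraud_scores = [100, 200, 300, 400]
credit_scores = [700, 500, 500, 700]

rho = pearson(fraud_scores, credit_scores)

# Assumed rule of thumb mirroring the "less than 20%" figure in the text.
is_complementary = abs(rho) < 0.20
```

When the two scores are weakly correlated, a decision matrix that crosses fraud-score bands with credit-score bands can isolate applicants who look acceptable on credit risk but risky on fraud.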
Voxco Intelligence developed a machine learning solution that predicts fraud risk by leveraging the unstructured header information present in the credit bureau file. A self-learning loop was also created to update the model based on monthly data updates. The feature library created was generic enough to be applied to the entire bureau file.
A random extract of the bureau data was used for building the feature library. The extract consisted of the current and past header data (address, phone number, etc.) and the trades and inquiry data. The standard features from the inquiry and trades data were created using the feature library already in use at Equifax India. The features from the trades and inquiry data were used to create interactions with the header data and to define possible segmentations. Some additional fraud indicators (derived from first payment default and straight roll) were also created to understand geo clusters and linkages, which were then used to create features around the geo fraud risk index.
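One way to picture the linkage idea is to group applications that share a header attribute, directly or through a chain of other applications, and then count known fraud among each application's linked neighbours. The sketch below uses a union-find structure for the grouping. It is an illustration under stated assumptions, not the method used in the project: the record layout, the field names (`id`, `phone`, `address`, `is_fraud`) and the feature definition are all hypothetical.

```python
from collections import defaultdict


def link_applications(applications):
    """Group applications that share a phone or address, directly or transitively."""
    parent = {app["id"]: app["id"] for app in applications}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> first application id carrying that value
    for app in applications:
        for field in ("phone", "address"):
            value = app.get(field)
            if not value:
                continue
            key = (field, value)
            if key in first_seen:
                union(first_seen[key], app["id"])
            else:
                first_seen[key] = app["id"]

    clusters = defaultdict(list)
    for app in applications:
        clusters[find(app["id"])].append(app["id"])
    return list(clusters.values())


def linked_fraud_feature(applications):
    """Per application: confirmed frauds among the OTHER members of its cluster."""
    fraud_by_id = {app["id"]: app["is_fraud"] for app in applications}
    feature = {}
    for cluster in link_applications(applications):
        total = sum(fraud_by_id[i] for i in cluster)
        for i in cluster:
            # Exclude the application's own label to avoid target leakage.
            feature[i] = total - fraud_by_id[i]
    return feature


# Hypothetical header records (made-up values).
applications = [
    {"id": "A1", "phone": "p1", "address": "addr1", "is_fraud": 1},
    {"id": "A2", "phone": "p1", "address": "addr2", "is_fraud": 0},
    {"id": "A3", "phone": "p3", "address": "addr2", "is_fraud": 0},
    {"id": "A4", "phone": "p4", "address": "addr4", "is_fraud": 0},
]
linked_fraud = linked_fraud_feature(applications)
```

Here A3 never shares a value with the fraudulent A1, yet it is linked to it through A2, which is the kind of connection that application-level data alone would miss.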
Once the feature library was created, confirmed application fraud and first payment default with straight roll to write-off were used as the bad definition. A deep learning methodology was used to identify the features that went into the model. Voxco Intelligence leveraged its proprietary parameter tuning methodologies to fine-tune the model on the development and cross-validation samples. Further validation was performed on the out-of-time validation sample.
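The bad definition and the out-of-time sample can be sketched as follows. This is a minimal illustration, assuming each record already carries boolean flags for confirmed fraud, first payment default and straight roll to write-off; the source does not give the operational rules behind those flags. The field names and the cutoff date are hypothetical.

```python
from datetime import date


def is_bad(record):
    """Bad = confirmed application fraud, or first payment default
    that rolled straight to write-off."""
    fpd_straight_roll = (
        record["first_payment_default"] and record["straight_roll_to_write_off"]
    )
    return bool(record["confirmed_fraud"] or fpd_straight_roll)


def out_of_time_split(records, cutoff):
    """Accounts opened before the cutoff form the development sample;
    accounts opened on or after it form the out-of-time sample."""
    development = [r for r in records if r["opened_on"] < cutoff]
    out_of_time = [r for r in records if r["opened_on"] >= cutoff]
    return development, out_of_time


# Hypothetical records (made-up values).
records = [
    {"opened_on": date(2016, 3, 1), "confirmed_fraud": True,
     "first_payment_default": False, "straight_roll_to_write_off": False},
    {"opened_on": date(2016, 7, 15), "confirmed_fraud": False,
     "first_payment_default": True, "straight_roll_to_write_off": True},
    {"opened_on": date(2017, 2, 10), "confirmed_fraud": False,
     "first_payment_default": True, "straight_roll_to_write_off": False},
]

labels = [is_bad(r) for r in records]
development, out_of_time = out_of_time_split(records, cutoff=date(2017, 1, 1))
```

Splitting by time rather than at random tests whether the model still holds up on accounts opened after the period it was trained on.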
The final models were also validated across different segments, such as two-wheeler loans, credit cards, gold loans, and other unsecured or semi-secured products. In addition, Voxco Intelligence helped validate the models on two different sets of Equifax customers.
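Segment-level validation amounts to computing the same metric separately for each product segment. A minimal sketch, assuming accuracy as the metric and hypothetical segment names and labels; the rows below are made-up and are not the project's results.

```python
from collections import defaultdict


def accuracy_by_segment(rows):
    """rows: iterable of (segment, actual_label, predicted_label) tuples.
    Returns the share of correct predictions per segment."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for segment, actual, predicted in rows:
        total[segment] += 1
        correct[segment] += int(actual == predicted)
    return {segment: correct[segment] / total[segment] for segment in total}


# Hypothetical scored records: (segment, actual, predicted), 1 = bad, 0 = good.
rows = [
    ("two_wheeler", 1, 1),
    ("two_wheeler", 0, 0),
    ("two_wheeler", 0, 1),
    ("two_wheeler", 1, 1),
    ("credit_card", 0, 0),
    ("credit_card", 1, 0),
]
segment_accuracy = accuracy_by_segment(rows)
```

Because fraud is typically rare, accuracy alone can look strong for a model that flags very few cases, so segment-level precision and recall are worth reporting alongside it.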
Voxco Intelligence also provided knowledge-sharing and training sessions to Equifax's internal team, both to develop their capabilities and to ensure a smooth transition and ongoing maintenance of the models developed.