Fair Lending & Compliance

Getting Adverse Action Notices Right for Machine Learning Credit Models

Jay Budzik

November 2, 2020

A significant technology shift is underway in consumer lending as the industry switches over to machine learning underwriting models for more accuracy in assessing credit risk. One question lenders face if they want to use ML: How do I parse a complex model well enough to tell borrowers exactly why they got turned down for a loan or a higher credit line?

Under the Fair Credit Reporting Act and Equal Credit Opportunity Act, lenders must send a notice of adverse action (or NOAA) when they deny a consumer's application (see 12 CFR 1002.9). The notice must include the applicant’s credit score (if they have one) and up to four key factors behind the adverse action. Those factors should reflect the principal reasons the applicant was denied and accurately describe and relate to the factors considered in scoring a borrower. These reasons include factors such as high bank card balances, late payments, or bankruptcies. The NOAA mandate provides transparency into the underwriting process, serves as a check on discrimination, and educates consumers on how to improve their credit.

The looming problem is that the methods used today to generate adverse action reasons for linear models are horrendous at identifying the correct denial reasons for machine learning models. ML models are more accurate but more complex, requiring a rigorous approach to ensure declined applicants get the right reasons. Consumers deserve the truth, the law requires it, and banks need valid reason codes to inform internal analyses, including fair lending analysis.

Industry Coming Together to Improve the NOAA Process

In early October, the Consumer Financial Protection Bureau held a virtual “tech sprint” where teams from financial services and tech companies and NGOs collaborated on projects to improve the NOAA process. Team Zest joined with executives from First National Bank of Omaha, Citizens Bank, and WebBank to present an analysis of how badly standard methods fail at producing accurate NOAAs for machine learning underwriting models. We also showed how more rigorous methods could deliver the right results every time.

The industry today generates reason codes using a technique called “drop one,” in which each variable in a credit underwriting model is deleted one at a time and the model is re-run to observe how the removal changes the score. The reason codes associated with the variables that drive the most significant difference in score go into the NOAA. A variation of drop one called “impute median” replaces each feature with its median value across approved borrowers, and a similar ranking process applies.
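To make the mechanics concrete, here is a minimal sketch of both rankings. The scoring function, feature names, and baseline conventions are hypothetical, not any vendor's actual implementation; the point is only to show how each variable is neutralized one at a time and the score deltas are ranked.

```python
def drop_one_reasons(score_fn, x, baseline=0.0, top_k=4):
    """'Drop one': replace each feature with a neutral baseline, one at a
    time, and rank features by how much the score changes."""
    base_score = score_fn(x)
    deltas = {}
    for name in x:
        dropped = dict(x)
        dropped[name] = baseline            # "delete" the variable
        deltas[name] = score_fn(dropped) - base_score
    # Features whose removal raises the score most hurt the applicant most
    return sorted(deltas, key=deltas.get, reverse=True)[:top_k]

def impute_median_reasons(score_fn, x, medians, top_k=4):
    """'Impute median': same ranking, but each feature is set to its
    median value across approved borrowers instead of a fixed baseline."""
    base_score = score_fn(x)
    deltas = {name: score_fn({**x, name: medians[name]}) - base_score
              for name in x}
    return sorted(deltas, key=deltas.get, reverse=True)[:top_k]
```

On a linear model these rankings behave reasonably; the article's point is that they break down once variables interact, because perturbing one feature at a time never sees those interactions.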

The problem with drop one and impute median is that they can only handle single-direction, single-variable relationships (an increase in variable A always increases the score). A machine learning model can have hundreds of credit variables, and each variable can have a positive or negative influence on the score depending on its combination with any of the others. You need serious math to explain what’s going on inside a machine learning model and to quantify the influence of each input on the model’s score. Zest’s approach is based on Nobel Prize-winning mathematical methods developed by Lloyd Shapley, Robert Aumann, and Olga Bondareva. It renders our clients’ models transparent by using provably correct, game-theoretic methods to quantify the importance of each variable the model uses to determine a score.
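The core game-theoretic idea can be sketched with exact Shapley values: each feature's contribution is its average marginal effect on the score over every possible subset of the other features, rather than a single one-at-a-time perturbation. This brute-force version is for illustration only (real systems use efficient approximations, and the model and baseline here are hypothetical).

```python
from itertools import combinations
from math import factorial

def shapley_reasons(score_fn, x, baseline, top_k=4):
    """Rank features by exact Shapley value: the average marginal
    contribution of each feature across all coalitions of the others."""
    names = list(x)
    n = len(names)

    def coalition_score(subset):
        # Features in the coalition take the applicant's values;
        # everything else is held at the baseline.
        point = {f: (x[f] if f in subset else baseline[f]) for f in names}
        return score_fn(point)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += weight * (coalition_score(set(subset) | {f})
                                   - coalition_score(set(subset)))
        phi[f] = total
    # The most negative contributions pushed the score down hardest
    return sorted(phi, key=phi.get)[:top_k]
```

Because every coalition is scored, interactions between variables are averaged into each feature's attribution instead of being missed, which is what the one-at-a-time methods cannot do.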

Real-world Use Case Highlights Power of More Rigorous Method

To demonstrate the chasm of correctness between the status quo and game-theoretic approaches, we built a machine learning underwriting model and scored 1,000 loan applications through it. We produced adverse action reasons for the rejected population using drop one, impute median, and game-theoretic methods. Drop one had severe problems, reporting the wrong top reason 98% of the time. Impute median never got the top reason right. The game-theoretic approach was correct 100% of the time. We’ve seen similar results when analyzing millions of consumer credit applications for our customers before the tech sprint.

The two common techniques for understanding linear models cannot parse ML models.

Inaccurate Reason Codes are Bad for Lenders and Consumers

The wrong methods can lead lenders to the wrong conclusions about denials for protected-class applicants (people of color, women, immigrants). Regulators need valid denial reasons because they often use them in their fair lending analysis. A CFPB study of publicly reported adverse action reason codes revealed the most common reasons for denying mortgages to people of color. In the sample model that we built for the CFPB tech sprint, the more rigorous mathematics of game theory identified low credit limits, inquiries, and length of credit history as the top three denial factors for African-American applicants. None of those reasons came up in the top three when using drop one and impute median.

Inaccurate reason codes make it impossible for consumers to know the real reasons why they got denied. Wrong codes also scramble a consumer’s ability to correct errors in their credit file or to improve their standing as a borrower by addressing the shortcomings that affected the outcome of their application. Let’s say a lender using a machine learning credit model denies Ralph a loan for a $25,000 Honda Civic. Using drop one, the lender might end up telling Ralph he got rejected because the price of the car was too high. So Ralph goes off thinking he needs to buy a cheaper car. But the real reason the lender rejected Ralph was that his revolving account balances were too high. What Ralph needs to do is pay down his credit cards.

Status Quo Hinders Our Ability To Make Lending Fairer

Wrong reasons also make it impossible to address systemic racism in lending. A bank’s internal model reviewers and compliance teams, whose job it is to protect the financial institution from legal risk, need to know of any problematic variables used to deny a consumer’s loan application. Consumer advocacy groups use that same information when pursuing legal remedies in bias and discrimination cases. Valid adverse action reasons are a crucial tool in fair lending enforcement, and banks need to get them right.

Practical implementations of rigorous methods for generating notices of adverse action are available now. They work on all kinds of models: logit, gradient boosted decision trees, neural networks, and combinations of all three. These methods can help lenders understand how their models work and mitigate disparate impact by giving lenders and regulators the accuracy they need to spot correlations between race and particular reason codes. That can make lending more inclusive. Precise reason codes can inform new underwriting standards to build special-purpose credit programs that increase approvals for people of color. Exact reasons also ensure that all denied applicants receive more valuable intelligence to correct errors in their credit files and improve their approval chances in the future.

It’s essential to get this right. Too much is at stake. Are you considering adopting a machine learning model in your lending business? Ask your modeling team how they plan to generate notices of adverse action and whether they can prove their method produces reliable results.

