It’s time for lenders to stop mislabeling millions of Black Americans
Despite legislative action and social justice efforts, the homeownership rate for Black Americans has not changed significantly since the 1960s. Consumer advocates and fair housing groups have set a goal of increasing Black homeownership to 60 percent by 2040. However, most of the solutions to get there, such as building more affordable housing, providing better financial education and ensuring wider access to fair and affordable credit, are complex and require broad coalitions.
There is one relatively simple fix that would have an immediate impact on the Black homeownership rate — improve how we estimate race in lending.
To comply with federal fair lending laws, banks and credit unions must prove they don’t discriminate based on race and other protected statuses. But lenders aren’t allowed (except in mortgage lending) to ask the race of the applicant. And, even in mortgage lending, almost a third of applicants put nothing down.
In the absence of data, lenders, regulators and credit bureaus have to guess. The de facto way to do that is a formula called Bayesian Improved Surname Geocoding. The RAND Corporation developed BISG 20 years ago to study discrimination in health care. It brought much-needed objectivity to fair lending analysis and enforcement by combining last name and ZIP code, or Census tract, to calculate a best estimate of race. RAND found BISG was right at least 9 out of 10 times in identifying people as Black, especially in racially homogeneous areas.
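At its core, BISG is just Bayes' rule under a conditional-independence assumption: the surname-based race probabilities and the geography-based race probabilities are multiplied, divided by the base rates, and renormalized. Here is a minimal sketch of that combination step with made-up probabilities; in practice the inputs come from Census surname tables and tract-level demographics, not the illustrative numbers below.

```python
# Toy sketch of the BISG combination step. All probabilities here are
# illustrative, not real Census figures. Assuming surname and geography
# are conditionally independent given race, Bayes' rule gives:
#   P(race | surname, geo)  ∝  P(race | surname) * P(race | geo) / P(race)

RACES = ["black", "white", "hispanic", "asian", "other"]

# Hypothetical P(race | surname) for one surname.
p_race_given_surname = {"black": 0.40, "white": 0.45, "hispanic": 0.05,
                        "asian": 0.05, "other": 0.05}
# Hypothetical P(race | tract) for one Census tract.
p_race_given_tract = {"black": 0.70, "white": 0.20, "hispanic": 0.05,
                      "asian": 0.03, "other": 0.02}
# Hypothetical national base rates P(race).
p_race = {"black": 0.13, "white": 0.60, "hispanic": 0.18,
          "asian": 0.06, "other": 0.03}

def bisg(p_surname, p_tract, p_prior):
    """Combine surname and geography evidence, then renormalize to sum to 1."""
    unnormalized = {r: p_surname[r] * p_tract[r] / p_prior[r] for r in RACES}
    total = sum(unnormalized.values())
    return {r: v / total for r, v in unnormalized.items()}

posterior = bisg(p_race_given_surname, p_race_given_tract, p_race)
print(max(posterior, key=posterior.get))  # prints "black"
```

With these toy inputs, a common surname in a predominantly Black tract yields a posterior above 90 percent for "black", which is exactly why the method works well in homogeneous areas and degrades in diverse ones.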
The problem is that our country is not racially homogeneous, and the predictiveness of surnames gets less accurate every year as neighborhoods diversify and densify, and as the rate of racial intermarriage increases. A 2014 Charles River Associates study on auto loans found BISG correctly identified Black American borrowers 24 percent of the time at an 80 percent confidence threshold. The CFPB, using a different set of loans, found that BISG correctly identified only 39 percent of Black Americans.
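Reported accuracy also depends on how probabilities are turned into labels: a proxy can assign the highest-probability race outright, or assign a label only when the probability clears a cutoff such as the 80 percent threshold in the Charles River study. A toy sketch of the two assignment rules, with illustrative probabilities:

```python
# Two common ways to convert race-probability estimates into labels.
# The probabilities are illustrative only.
posterior = {"black": 0.55, "white": 0.30, "hispanic": 0.15}

def assign_argmax(probs):
    """Assign the highest-probability class, with no minimum confidence."""
    return max(probs, key=probs.get)

def assign_threshold(probs, threshold=0.80):
    """Assign only when one class clears the cutoff; otherwise leave unassigned."""
    race, p = max(probs.items(), key=lambda kv: kv[1])
    return race if p >= threshold else None

print(assign_argmax(posterior))     # prints "black"
print(assign_threshold(posterior))  # prints "None": no class reaches 0.80
```

Under the stricter rule, many borrowers in diverse areas never clear the cutoff at all, which is one reason different studies arrive at such different numbers.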
Take that in for a second. The formula responsible for policing fair lending is mislabeling white Americans as Black and Black Americans as white. This might not seem so problematic on its face. However, if a Black applicant is counted as white, and the loan application gets declined, then that decline is counted as a white decline, not a Black decline. How can we close the racial wealth gap with a broken yardstick?
We’re not saying to throw BISG out, but let’s use it only until a better alternative is ready. Data science has advanced considerably since Bayes’ theorem debuted in the 1700s. We should harness the latest tech for good, and there’s some promising work already being done out there.
Zest AI’s data science team, for example, built a BISG replacement called the Zest Race Predictor. It’s a relatively simple machine-learning model that estimates race using first, middle, and last names and a richer location data set. We trained and tested ZRP using a national sample of Paycheck Protection Program loan borrowers. ZRP identified Black borrowers with 23 percent more accuracy than BISG when assigning race from the highest probability. It also cut the number of whites identified as nonwhite by 61 percent.
This is not some academic exercise. More accurate race labels make for better fair lending analysis, which produces more equitable lending.
Zest AI worked with a Florida auto lender to create two fairer alternatives to its baseline model, one built using BISG and the other using ZRP. The result? The ZRP-trained model did a much better job closing the approval rate gap between white and Black borrowers, from 85 percent to 91 percent.
Fixing race estimation is low-hanging fruit on the long to-do list of actions to restore equity to our economy. Banks and credit unions should urge their model developers (including vendors like FICO, Vantage and others) to use better techniques for fair lending analysis. Congress should also encourage the CFPB to adopt new approaches to estimating race.
We have the data. The US Census Bureau has started releasing 2020 census data. Government-sponsored enterprises like Fannie Mae and Freddie Mac have a vast trove of Home Mortgage Disclosure Act data and, while they remain under federal conservatorship, the Federal Housing Finance Agency could make it available to federal agencies to study.
We’re happy to give our algorithm away to any regulator or lender that wants to work with it.
Let’s capitalize on the opportunity to use public data for the public good and protect our most vulnerable populations. Let’s create a new generation of credit scoring models that fairly allocate private and public resources, including the Treasury dollars used to guarantee mortgage loans. Now is the time to use every tool at our disposal to root out inequity methodically and ensure fairer outcomes for everyone.