

“A $%^* Sexist Program”: Detecting and Addressing AI Bias

Artificial Intelligence
Updated on: August 16th, 2021 · 7 minutes

A major issue facing companies that use AI, algorithmic bias can perpetuate social inequalities — as well as pose legal and reputational risks to the companies in question. New research at HEC Paris offers a statistical method of tracking down and eliminating unfairness.

(Cover image: facial recognition. ©metamorworks on Adobe Stock)


Soon after Apple issued its Apple Card in August 2019, urgent questions arose. David Heinemeier Hansson, a well-known software developer and author, reported in a tweet that both he and his wife had applied for the card. “My wife and I filed joint tax returns, live in a community-property state, and have been married for a long time,” Hansson wrote. “Yet Apple’s black box algorithm thinks I deserve 20x the credit limit she does.” He called it a “sexist program,” adding an expletive for good measure.


"If a credit-scoring algorithm is trained on a biased dataset of past decisions by humans, the algorithm would inherit and perpetuate human biases." (Photo ©Siberian Art on Adobe Stock)


Goldman Sachs, the issuing bank for the card, defended it, saying that the AI algorithm used to determine creditworthiness did not even take gender into account. That sounds convincing, except that it ignores the fact that even when explicit gender information is removed, an algorithm may still rely on inputs that correlate with gender (“proxies” for gender, such as where a person shops) and thus still produce unintended bias.

Even Apple’s cofounder Steve Wozniak reported that he and his wife had experienced this bias: Wozniak was judged worthy of 10 times more credit than his wife, even though the couple share assets and bank accounts. The ensuing uproar prompted New York regulators to open an investigation into the Apple Card’s algorithm.

Biased data leads to biased results 

AI and machine learning can process far larger quantities of data, far more efficiently, than humans can. Applied properly, AI therefore has the potential to eliminate discrimination against certain societal groups. In reality, however, cases of algorithmic bias are not uncommon, as the Apple Card episode above shows.


If a credit-scoring algorithm is trained on a biased dataset of past decisions by humans, the algorithm would inherit and perpetuate human biases.


The reasons for this bias vary. If, for example, a credit-scoring algorithm is trained on a biased dataset of past decisions by humans (made by racist or sexist credit officers, say), the algorithm will inherit and perpetuate those human biases. And because AI uses thousands of data points and opaque methods of decision making (sometimes described as a black box), the resulting biases may be entirely unintended and go undetected.
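To make the mechanism concrete, below is a minimal Python simulation (an illustration of the general point, not code or data from the study): a scoring model is trained on historically biased approval decisions, gender is excluded from the inputs, yet a proxy variable that correlates with gender lets the bias survive.

```python
# Illustrative simulation (not from the paper): a model trained on biased
# historical approvals reproduces the bias even though gender is excluded,
# because a correlated proxy variable leaks the information.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

gender = rng.integers(0, 2, n)            # protected attribute: 0 = man, 1 = woman
income = rng.normal(50, 10, n)            # legitimate attribute
proxy = gender + rng.normal(0, 0.5, n)    # e.g., a shopping pattern correlated with gender

# Historical decisions by biased credit officers: women are penalized.
logit = 0.1 * (income - 50) - 1.5 * gender
past_approval = rng.random(n) < 1 / (1 + np.exp(-logit))

# Train WITHOUT the gender column -- only income and the proxy.
X = np.column_stack([income, proxy])
model = LogisticRegression().fit(X, past_approval)

scores = model.predict_proba(X)[:, 1]
print("mean predicted approval, men:  ", round(scores[gender == 0].mean(), 3))
print("mean predicted approval, women:", round(scores[gender == 1].mean(), 3))
# The gap persists: the algorithm has inherited the human bias via the proxy.
```

Dropping the protected attribute is therefore not enough on its own, which is exactly the kind of residual dependence a fairness test needs to catch.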


"When machine learning techniques, which are often difficult to interpret, are poorly applied, they can generate unintended, unseen bias toward entire populations." (Photo ©Nuthawut on Adobe Stock)


In credit markets — the focus of our work — this lack of fairness can place groups that are underprivileged (because of their gender, race, citizenship or religion) at a systemic disadvantage. Certain groups could be unreasonably denied loans, or offered loans at unfavorable interest rates — or given low credit limits. A lack of fairness may also expose the financial institutions using these algorithms to legal and reputational risk. 

A “traffic light” test for detecting unfair algorithms

My fellow researchers, Christophe Hurlin and Sébastien Saurin, and I established a statistics-based definition of fairness, as well as a way to test for it. To be fair, the decisions made by an algorithm should be driven only by attributes genuinely related to the target variable, such as employment duration or credit history, and should be independent of attributes such as gender. Using statistical theory, we derived a formula to compute fairness statistics, as well as the theoretical threshold above which a decision would be considered fair.


We established a statistics-based definition of fairness as well as a way to test for it.

When dealing with an actual algorithm, one can first compute the fairness statistics and compare them to the theoretical value or threshold. It is then possible to conclude whether an algorithm is “green” (when the fairness statistics are greater than our established threshold) or “red” (when the fairness statistics are less than the threshold). 
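The exact fairness statistic and threshold come from the authors' derivation and are not reproduced here. As a simplified, hypothetical stand-in, the sketch below uses the p-value of a two-proportion test of equal approval rates across groups as the fairness statistic, and a conventional significance level as the threshold, then labels the algorithm green or red accordingly.

```python
# Hedged sketch of a "traffic light" check. The statistic and threshold here
# (a two-proportion z-test on approval rates at the 5% level) are illustrative
# stand-ins for the fairness statistic derived in the paper.
import numpy as np
from scipy.stats import norm

def traffic_light(approved: np.ndarray, group: np.ndarray, threshold: float = 0.05) -> str:
    """'green' if decisions look independent of the protected group, else 'red'."""
    p1, p0 = approved[group == 1].mean(), approved[group == 0].mean()
    n1, n0 = (group == 1).sum(), (group == 0).sum()
    p_pool = approved.mean()
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    fairness_stat = 2 * (1 - norm.cdf(abs(z)))   # p-value: larger = more compatible with fairness
    return "green" if fairness_stat > threshold else "red"

# Usage on synthetic decisions that penalize the protected group:
rng = np.random.default_rng(1)
group = rng.integers(0, 2, 5_000)
decisions = rng.random(5_000) < (0.7 - 0.1 * group)
print(traffic_light(decisions, group))           # -> "red"
```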

Second, if there is a problem, we offer techniques to detect the variables creating the problem — even if the algorithm’s processes are impenetrable. To do so, we developed new AI explainability tools. Third, we suggest ways to mitigate the problem by removing the offending variables.
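The authors' explainability tools are specific to their framework and are not reproduced here. As a generic stand-in, and continuing the simulated example above, the sketch below attributes the unfairness by retraining the model without each input variable in turn and measuring how much the gender gap in predicted approval shrinks; mitigation then amounts to dropping the worst offender and re-running the traffic-light test.

```python
# Generic attribution sketch (not the paper's explainability method): retrain
# without each variable and see how much the group gap in predictions shrinks.
# Reuses `X`, `gender`, `past_approval` and `model` from the simulation above.
import numpy as np
from sklearn.linear_model import LogisticRegression

def group_gap(scores: np.ndarray, group: np.ndarray) -> float:
    return abs(scores[group == 0].mean() - scores[group == 1].mean())

feature_names = ["income", "proxy"]
baseline_gap = group_gap(model.predict_proba(X)[:, 1], gender)

for j, name in enumerate(feature_names):
    X_drop = np.delete(X, j, axis=1)
    m = LogisticRegression().fit(X_drop, past_approval)
    gap = group_gap(m.predict_proba(X_drop)[:, 1], gender)
    print(f"without {name}: gap {baseline_gap:.3f} -> {gap:.3f}")

# Mitigation: drop the variable whose removal closes the gap the most
# (here the proxy), retrain, and re-run the traffic-light test until it is green.
```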



We developed new AI explainability tools to detect the variables creating the problem of unfairness.


From a purely practical, business perspective, it is important that banks understand the implications — and potential unintended consequences — of the technology they are using. They may risk running afoul of both the justice system and public opinion — and it goes without saying that reputation and trust are key in the banking industry.

Application across diverse fields

While our focus has been on credit scoring, our methodology could potentially be applied in many other contexts in which machine learning algorithms are employed, such as predictive justice (sentencing, probation), hiring decisions (screening of applicants’ CVs and videos), fraud detection and pricing of insurance policies.


"Our methodology could potentially be applied in many contexts in which machine learning algorithms are employed, such as predictive justice, hiring decisions, fraud detection and pricing of insurance policies." (Photo ©artinspiring on Adobe Stock)


The use of machine learning technology raises many ethical, legal and regulatory questions. When machine learning techniques, which are often difficult to interpret, are poorly applied, they can generate unintended, unseen bias against entire populations on the basis of ethnic, religious, sexual, racial or social criteria. The opportunities and risks that come with machine learning undoubtedly call for a new form of regulation based on the certification of the algorithms and data used by companies and institutions.


In the short term, we aim to help companies and institutions that use AI to better understand the decisions of their algorithms and to detect potential unintended consequences. In the longer term, we hope to contribute to the discussion about guidelines, standards and regulations that public administrators should institute.


Drawing on work conducted over 15 years on risk model validation, we developed new statistical tests that detect a lack of fairness. This “traffic light” test statistically analyzes whether an algorithm’s decisions are fair (“green”) or unfair (“red”) against protected societal groups. If an algorithm’s decisions are found to be unfair, we suggest techniques to identify the variables responsible for the bias and to mitigate them.
Based on an interview with Christophe Pérignon and his academic article “The Fairness of Credit Scoring Models,” co-written with Christophe Hurlin and Sébastien Saurin, both from the University of Orléans.
