Debiasing Algorithms: Fair Machine Learning

I have recently become interested in the topic of Algorithmic Bias. This page presents a few ideas I am working on in collaboration with the KPN ICAI Lab for Responsible AI.

A Fair Machine Learning Approach Against Discriminatory Outcomes

As machines are trained to analyze complex problems, many tasks that previously required human intelligence are now either assisted or fully automated through Artificial Intelligence (AI). Increasingly, Machine Learning algorithms are used to predict behaviors and classify individuals. They promise to improve decision-making across many domains (public funding, justice, education, healthcare, finance, supply chain management, communication, marketing, and human resources) by considering thousands of factors and learning meaningful relationships from historical evidence. At the same time, society has come to realize that algorithms are not flawless. The number of reported cases of algorithmic bias has exploded in recent years. Everywhere, concerns are rising that AI-generated decisions may lead to discriminatory actions against protected groups. Often, algorithms reproduce or even amplify biases present in human decisions. In some cases, they even inadvertently create new discriminatory outcomes.

Together with a team of multidisciplinary experts, we are working on developing a holistic and versatile framework to enhance fairness in causal machine learning. The approach enables the elaboration of fair and effective policies. At its core, it designs a fair loss function that controls the accuracy-fairness trade-off. The approach prevents the introduction of biases at all stages of the machine learning loop: (1) during data acquisition, using active learning; (2) during estimation and optimization, using a penalized loss function; and (3) during evaluation, using offline evaluation of fairness in a causal context. Explainable AI tools will keep track of potential biases throughout the process. Our aim is to test this new technology with a set of external partners across industries. Code and tutorials will be released following Open Science standards.

If you are interested in collaborating with us, please let us know: lemmens [at] rsm . nl



As machines are trained to analyze complex problems, many tasks that previously required human intelligence are now either assisted or fully automated through Artificial Intelligence (AI). Supervised machine learning is one of the most important applications of modern AI. Machine learning (ML) algorithms for regression and classification can jointly consider thousands of variables and identify complex correlations in data to accurately predict outcomes. As a result, they promise to improve decision-making across many public and private domains: public funding (eligibility for pensions, unemployment benefits), criminal justice (fraud detection, face recognition), education (college admission), healthcare (patient prioritization, vaccine innovation), finance (lending decisions), information and communication media (search engines, news recommenders), marketing (pricing, targeting), and human resources (employee selection and evaluation). Lately, a breakthrough in machine learning has shifted the field’s scope from correlation discovery (supervised learning) to causal machine learning by combining the state of the art in machine learning and economics (Wager and Athey, 2018). Causal ML answers questions that supervised learning could not solve, namely: how to optimally allocate a (scarce) resource to the individuals for whom the causal effect of an intervention is highest (Athey, 2017)? To this end, causal ML estimates an intervention’s Conditional Average Treatment Effect (CATE) for subgroups of the population. It provides inputs for optimizing personalized policies (e.g., personalized medicine) and leads to better decision-making than supervised ML (Lemmens and Gupta, 2020). These types of algorithms are the focus of this research program.

Even though AI can become a force for good (Taddeo and Floridi, 2018), AI-driven decisions can have far-reaching effects on people (Bohren et al., 2019). In particular, there have been rising concerns about whether decisions guided by AI may lead to discriminatory actions against protected groups, i.e., groups characterized by protected attributes (e.g., gender, ethnicity, income, age, sexual orientation, or religion). Many examples have made the news in recent years:

  • In the public sector, SyRi, the system risk indication tool of the Dutch Ministry of Social Affairs, flagged individuals in low-income areas as more likely to commit fraud than those in high-income areas (Burack, 2020), leading to the resignation of the entire cabinet;
  • In the education sector, the French post-bac admission algorithm (APB) was terminated by President Macron in 2017 because of its lack of transparency and its bias against disadvantaged students (Frouillou, 2016);
  • In the financial sector, Apple’s new credit card declined credit lines to women more often than to men, despite their better credit scores (Telford, 2019);
  • In the human resources sector, Facebook displayed science, technology, engineering, and math (STEM) career ads more often to young men than to young women, despite equal qualifications (Lambrecht and Tucker, 2019).

Addressing algorithmic bias has become a top priority for governments. The European Commission has recently produced ‘Ethics Guidelines for Trustworthy AI’ (European Commission, 2019), as has the OECD. International human rights organizations recognize algorithmic biases as potential dangers, and the UN has made it clear that algorithmic biases work against many of the Sustainable Development Goals (e.g., on financial inclusion). Tackling these issues necessitates a comprehensive set of actions from all stakeholders involved: public policymakers, including new legal and regulatory measures (e.g., Goodman and Flaxman, 2017), and private stakeholders whose decisions rely on algorithms (Taddeo and Floridi, 2016). For the latter, a critical challenge is thinking beyond the objective of maximizing a policy’s effectiveness and incorporating considerations about algorithms’ societal impact when building them. Maximizing the fairness of policies should also become an objective when building AI systems. The current research program focuses on this goal.


The concept of algorithmic bias refers to the systematic preferential or discriminatory treatment of a group of people by algorithms (Barocas et al., 2021). At the core of the problem, algorithms often discriminate against protected groups. Based on this definition of algorithmic bias, an algorithm is fair if it does not discriminate against specific groups. The literature proposes many definitions of fairness. Barocas et al. (2021) propose three categories of fairness criteria (see Narayanan, 2018, for an extensive overview of all criteria):

  • Independence: the protected attribute is statistically independent of the predicted value. For instance, the statistical parity criterion requires predictions to be independent of the protected attribute.
  • Separation: the protected attribute is statistically independent of the predicted value, conditional on the actual outcome. For instance, the equal opportunity criterion requires that the behavior of protected and non-protected groups be predicted with the same accuracy (e.g., the same true positive rates).
  • Sufficiency: the protected attribute is statistically independent of the actual outcome, conditional on the predicted value. For instance, the calibration criterion requires that protected and non-protected groups who received the same predicted value have, on average, the same actual outcome.
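For binary predictions and a binary protected attribute, the three criteria can be operationalized as simple gap metrics. The sketch below is a minimal illustration, not the project's tooling; the function names and the convention that a zero gap means the criterion is satisfied are my own.

```python
import numpy as np

def statistical_parity_gap(y_pred, protected):
    """Independence: gap in positive-prediction rates across groups."""
    return abs(y_pred[protected == 1].mean() - y_pred[protected == 0].mean())

def equal_opportunity_gap(y_true, y_pred, protected):
    """Separation: gap in true-positive rates across groups."""
    tpr = lambda g: y_pred[(protected == g) & (y_true == 1)].mean()
    return abs(tpr(1) - tpr(0))

def calibration_gap(y_true, y_pred, protected):
    """Sufficiency: gap in actual positive rates among predicted positives."""
    rate = lambda g: y_true[(protected == g) & (y_pred == 1)].mean()
    return abs(rate(1) - rate(0))
```

Note that an algorithm can satisfy one criterion while badly violating another; which gap matters depends on the fairness definition the decision-maker adopts.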

These criteria have in common that they focus on the degree to which predictions of a given outcome (e.g., whether citizens commit fraud) are biased towards certain groups (e.g., low-income neighborhoods). For instance, statistical parity arises if individuals in low-income communities have the same probability of being predicted as delinquent as those in high-income areas. Equal opportunity means that the model is equally good at predicting fraudsters in low-income as in high-income neighborhoods. Calibration says that, among the predicted fraudsters, the share who actually committed fraud should be the same in low- and high-income neighborhoods. From this perspective, these are appropriate criteria to test for algorithmic biases in the context of supervised ML. However, they are not adequate to assess the fairness of causal ML algorithms. The following example illustrates the main problem. Suppose a causal ML algorithm prioritizes patients’ access to ICU beds based on their predicted change in life expectancy due to the treatment. Most likely, the treatment effect distribution does not fully correlate with the distribution of life expectancy in the absence of treatment. As a result, assessing whether patients’ predicted life expectancy is shorter when they belong to a protected group (statistical parity) is not a good test of the causal algorithm’s fairness. Likewise, accurately predicting the life expectancy of the protected group (equal opportunity) is also insufficient. Instead, one needs to ascertain that the priority ranking based on the treatment’s effectiveness for each patient is not biased towards specific groups.
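For a causal algorithm, the object to audit is thus the priority ranking itself. A minimal check (an illustrative sketch; the function name and the top-k convention are my own, not the project's procedure) compares the protected group's share among the top-ranked units with its share in the population:

```python
import numpy as np

def ranking_bias(cate_hat, protected, k):
    """Share of the protected group among the k units ranked highest by
    estimated treatment effect, minus its overall share. Values far from
    zero suggest the priority ranking favors or disfavors that group."""
    top = np.argsort(cate_hat)[::-1][:k]   # indices of the k largest CATEs
    return protected[top].mean() - protected.mean()
```

A large positive or negative value is only a red flag, not proof of unfairness: the protected group may genuinely benefit more (or less) from the intervention, which is exactly why causal fairness criteria need careful definition.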

At this stage, it is essential to realize that the challenge is complex because of the fundamental problem of causal inference (Rubin, 1974): we can never observe both counterfactuals (e.g., a patient’s expected lifetime if admitted to the ICU and if not admitted). Put differently, the ground truth of the causal effect is missing. Among other complexities, modeling the fairness of causal models is also challenging because protected attributes might affect both the outcome(s) and the treatment(s). Madras et al. (2019) proposed a first framework to conceptualize this complex problem. However, despite its importance for designing policies, the fairness of causal models has received little attention thus far.
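The fundamental problem can be made concrete with a small simulation: we generate both potential outcomes, then keep only the one that would actually be observed. The crude "T-learner" below recovers the group-level CATEs with stratified means under a randomized treatment; all numbers and names are illustrative assumptions, not the program's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: one binary covariate x, randomized treatment t, outcome y.
n = 10_000
x = rng.integers(0, 2, n)
t = rng.integers(0, 2, n)

# Both potential outcomes exist in the simulation only: the treatment
# helps the x = 1 group (true CATE 1.5) more than the x = 0 group (0.5).
y0 = rng.normal(0.0, 1.0, n)
y1 = y0 + 0.5 + 1.0 * x

# Fundamental problem of causal inference: only one outcome is observed.
y = np.where(t == 1, y1, y0)

def cate_hat(x_val):
    """Estimated CATE for the stratum x == x_val: difference between the
    average observed outcome of treated and control units in that stratum."""
    treated = y[(x == x_val) & (t == 1)].mean()
    control = y[(x == x_val) & (t == 0)].mean()
    return treated - control
```

Because treatment is randomized here, the stratified difference in means is unbiased for the true CATE; with observational data, or when the protected attribute affects both treatment and outcome, this simple estimator breaks down, which is precisely the setting the research program targets.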

Our Approach to Fair Machine Learning: A 360° Solution

  • Data Acquisition with Fair Active Learning
  • Penalized Causal Machine Learning for Model Training
  • Fair Offline Evaluation for Optimization

Our main ambition is to develop a conceptual and methodological framework that minimizes the risk that causal ML algorithms used for policy optimization generate unfair policies. At the core of this framework lies the design of a fair loss function for causal models. Loss functions are the backbone of any ML procedure. They directly impact the predictions and can change their distribution significantly. Intuitively, a loss function tells the machine what objective it should achieve with the data at hand. Despite their importance in guiding the algorithm, loss functions are often misaligned with the decision-maker’s objectives, leading to suboptimal decisions (Lemmens and Gupta, 2020). The novelty of this research program is to leverage the impact of the loss function to improve algorithmic fairness. The fair loss function will control the trade-off between maximizing a policy’s effectiveness and maximizing its fairness. The proposed framework has the following unique advantages with respect to prior work:

  • The approach offers a framework and tools to elaborate fair and effective policies, which is conceptually and technically different from, and more complex than, generating fair predictions of a given outcome. The goal is to deliver fair predictions of conditional average treatment effects (CATE) using the potential outcome framework for causal inference (Rubin, 2005). Among other insights, it will develop causal fairness criteria that differ from the classic fairness criteria for supervised ML.
  • The approach is versatile. In particular, decision-makers can (a) use the fairness definition of their choice (this feature is essential as algorithmic biases can take many forms) and (b) choose their favorite ML model (different models will be tested and compared to each other).
  • The approach is holistic. The literature identifies three primary sources of algorithmic bias, which relate to the stages of the ML loop: (1) during data acquisition, (2) during model estimation, and (3) during model evaluation (Barocas et al., 2021). Rather than focusing on one stage, the project addresses the risk of bias introduction or propagation at all stages. To do so, it relies on three domains of ML: (a) active learning, the process of acquiring data via ‘smart’ experiments; (b) causal ML, the estimation of treatment effects for policy optimization; and (c) offline evaluation, the process of evaluating policies before implementation.
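The accuracy-fairness trade-off controlled by the fair loss function can be sketched as a prediction-error term plus a weighted fairness penalty. The statistical-parity-style penalty and the parameter name `lam` below are illustrative assumptions; the project's actual causal loss will differ.

```python
import numpy as np

def fair_loss(y_true, y_pred, protected, lam=1.0):
    """Penalized loss: mean squared error plus a fairness penalty.

    The penalty is the squared gap between group-mean predictions, and
    lam controls the accuracy-fairness trade-off (illustrative only)."""
    accuracy = np.mean((y_true - y_pred) ** 2)
    gap = y_pred[protected == 1].mean() - y_pred[protected == 0].mean()
    return accuracy + lam * gap ** 2
```

Setting `lam = 0` recovers the standard mean squared error; larger values force the optimizer to accept some accuracy loss in exchange for smaller prediction gaps between groups.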

Our approach is articulated according to these three stages (see Figure 1). First, we propose an active learning algorithm that ensures that the selected experimental units (i.e., individuals) allow for a fair representation of protected groups and a fair estimation of their CATE. Second, we develop a fair ensemble ML model using gradient boosting (see, e.g., Lemmens and Croux, 2006) in a causal framework. The loss function will more heavily penalize errors on the experimental units that are most susceptible to discrimination. Intuitively, the penalization forces the algorithm to make accurate predictions for the protected groups. Such a weighting scheme ensures a fair prediction of their treatment effect. Third, we propose a novel offline evaluation procedure to evaluate whether the counterfactual predictions generated by algorithms discriminate against specific groups, and to minimize the risk of biases in the further actions taken based on these algorithms. The evaluation stage will be followed by a feedback loop. Controlling the introduction of bias at all stages will lead to better and more transparent algorithms. Explainable AI tools will keep track of potential biases.
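One simple instance of such a weighting scheme is balanced group reweighting, where each unit's loss contribution is scaled inversely to its group's share of the sample. This is a hedged sketch of the idea only; the project's penalization may weight units differently (e.g., by their susceptibility to discrimination), and the resulting weights would typically be passed as sample weights to a gradient-boosting implementation.

```python
import numpy as np

def fairness_weights(protected):
    """Weight each unit inversely to its group's sample share, so every
    group contributes equally to the training loss (illustrative scheme)."""
    groups = np.unique(protected)
    w = np.empty(len(protected), dtype=float)
    for g in groups:
        mask = protected == g
        # Weights are normalized so that they sum to the sample size.
        w[mask] = len(protected) / (len(groups) * mask.sum())
    return w
```

With a minority group of one unit out of four, that unit receives weight 2.0 while each majority unit receives 2/3, so both groups carry the same total weight in the loss.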

More soon…