Benchmarking Fairness Toolkits for AI Bias Detection and Mitigation

  • Type: Thesis
  • Target group: Bachelor
  • Lecturer:

    Julia Gutschow

Problem Description

As Artificial Intelligence (AI) models are increasingly deployed in high-stakes applications such as hiring, credit scoring, and law enforcement, concerns about algorithmic bias and fairness have grown. AI systems can unintentionally discriminate against certain demographic groups due to biased training data, model design, or systemic inequalities. To address these challenges, various fairness toolkits (e.g., Fairlearn, AIF360, Themis-ML) have been developed to help detect, quantify, and mitigate bias in machine learning models. However, these toolkits differ in terms of functionality, usability, scope, and effectiveness, making it difficult for practitioners to choose the most suitable approach for a given application.

Goal of the Thesis

The goal of this thesis is to conduct a structured comparative analysis of existing fairness toolkits, evaluating their capabilities in bias detection, fairness metrics, and bias mitigation strategies. The student will apply multiple toolkits to a benchmark dataset commonly used in fairness research (e.g., COMPAS, Adult Income, German Credit) and assess how effectively each framework identifies and addresses bias. The final output will be a comparative report summarizing the key insights, strengths, and weaknesses of each toolkit, offering guidance on their applicability to different AI fairness challenges.
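To give a concrete sense of what such a comparison involves, the sketch below implements one common group-fairness metric, the demographic parity difference, which toolkits such as Fairlearn and AIF360 report out of the box. The function names and the toy data are illustrative, not taken from any specific toolkit's API.

```python
# Minimal sketch of a group-fairness metric (demographic parity
# difference) of the kind toolkits such as Fairlearn or AIF360 compute.
# All names and data below are illustrative, not a toolkit API.

def selection_rate(y_pred, group, value):
    """Fraction of positive predictions within one demographic group."""
    preds = [p for p, g in zip(y_pred, group) if g == value]
    return sum(preds) / len(preds)

def demographic_parity_difference(y_pred, group):
    """Largest gap in selection rates across groups (0 means parity)."""
    rates = [selection_rate(y_pred, group, v) for v in set(group)]
    return max(rates) - min(rates)

# Toy binary predictions for two demographic groups "a" and "b"
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
group  = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Group "a" is selected at rate 0.75, group "b" at 0.25
print(demographic_parity_difference(y_pred, group))  # → 0.5
```

Benchmarking the toolkits then amounts to computing such metrics before and after applying each toolkit's mitigation strategies (e.g., reweighing or threshold adjustment) and comparing the resulting fairness–accuracy trade-offs.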

Requirements

  • Interest in AI ethics and fairness
  • Basic programming skills in Python and familiarity with machine learning frameworks (e.g., Scikit-learn, TensorFlow, PyTorch)
  • Experience with or willingness to learn how to use fairness toolkits such as Fairlearn, AIF360, or Themis-ML
  • Ability to conduct systematic evaluations, including benchmarking models on datasets

Sources

  • Bantilan, N. (2018). Themis-ml: A fairness-aware machine learning interface for end-to-end discrimination discovery and mitigation. Journal of Technology in Human Services, 36(1), 15-30.
  • Bird, S., Dudík, M., Edgar, R., Horn, B., Lutz, R., Milan, V., ... & Walker, K. (2020). Fairlearn: A toolkit for assessing and improving fairness in AI. Microsoft, Tech. Rep. MSR-TR-2020-32.
  • Hufthammer, K. T., Aasheim, T. H., Ånneland, S., Brynjulfsen, H., & Slavkovik, M. (2020). Bias mitigation with AIF360: A comparative study. In Norsk IKT-konferanse for forskning og utdanning (NIKT 2020).
  • Lee, M. S. A., & Singh, J. (2021, May). The landscape and gaps in open source fairness toolkits. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1-13).