Limitations and Biases of Large Language Models: Identifying, Measuring, and Counteracting Gender Stereotypes and Fairness Issues

  • Type: Thesis
  • Target group: Master's students
  • Supervisors:

    Prof. Dr. Jella Pfeiffer
    Pascal Heßler

Problem Description

Large Language Models (LLMs) such as GPT-4 have revolutionized natural language processing. However, they are not free from limitations and biases, including the reinforcement of gender stereotypes (e.g., associating "CEO" primarily with white men) and fairness issues in their responses. These biases can produce outputs that perpetuate discrimination or marginalize underrepresented groups. Addressing these challenges is essential for ethical AI development and deployment.

https://www.youtube.com/shorts/nMVswCXOtAI

Goal of the Thesis

This thesis examines the limitations and biases inherent in LLMs, focusing on gender stereotypes and fairness. The research will investigate methods for measuring such biases (e.g., stereotype association tests) and explore mitigation strategies such as improved dataset curation, bias-aware training methodologies, and post-processing techniques.
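To make the idea of a stereotype association test concrete, the following is a minimal sketch of a WEAT-style score (in the spirit of embedding-association tests such as Bolukbasi et al.'s analysis): it measures whether target words (e.g., occupations) sit closer to one attribute set (male terms) than another (female terms) in embedding space. The tiny hand-made vectors here are an illustrative assumption; a real test would use embeddings extracted from the model under study.

```python
# Sketch of a stereotype association test over word embeddings.
# The toy 2-D vectors below are illustrative assumptions, NOT real model output.
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def association(w, A, B, emb):
    """Mean similarity of word w to attribute set A minus attribute set B."""
    sim_a = sum(cosine(emb[w], emb[a]) for a in A) / len(A)
    sim_b = sum(cosine(emb[w], emb[b]) for b in B) / len(B)
    return sim_a - sim_b

def weat_effect(X, Y, A, B, emb):
    """Difference of mean associations between target sets X and Y."""
    ax = sum(association(x, A, B, emb) for x in X) / len(X)
    ay = sum(association(y, A, B, emb) for y in Y) / len(Y)
    return ax - ay

# Hypothetical embeddings: "ceo" points toward the "he" axis,
# "nurse" toward the "she" axis, mimicking a stereotyped space.
emb = {
    "ceo":   [0.9, 0.1],
    "nurse": [0.1, 0.9],
    "he":    [1.0, 0.0],
    "she":   [0.0, 1.0],
}

score = weat_effect(["ceo"], ["nurse"], ["he"], ["she"], emb)
print(round(score, 3))  # positive: "ceo" leans male while "nurse" leans female
```

A score near zero would indicate no differential association; in practice such tests are run over many target/attribute words with permutation tests for significance.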

Requirements

  • Familiarity with concepts of fairness in AI and interest in bias mitigation techniques.

Sources

  • Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922
  • Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Advances in Neural Information Processing Systems, 29, 4349–4357.
  • Binns, R. (2018). Fairness in Machine Learning: Lessons from Political Philosophy. Proceedings of the 2018 Conference on Fairness, Accountability, and Transparency, 149–159.
  • Devinney, H., Björklund, J., & Björklund, H. (2024). We Don’t Talk About That: Case Studies on Intersectional Analysis of Social Bias in Large Language Models. Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), 33–44. https://doi.org/10.18653/v1/2024.gebnlp-1.3
  • Gupta, M., Parra, C. M., & Dennehy, D. (2022). Questioning Racial and Gender Bias in AI-based Recommendations: Do Espoused National Cultural Values Matter? Information Systems Frontiers, 24(5), 1465–1481. https://doi.org/10.1007/s10796-021-10156-2
  • You, Z., Lee, H., Mishra, S., Jeoung, S., Mishra, A., Kim, J., & Diesner, J. (2024). Beyond Binary Gender Labels: Revealing Gender Bias in LLMs through Gender-Neutral Name Predictions. Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), 255–268. https://doi.org/10.18653/v1/2024.gebnlp-1.16