Related papers: Detecting Gender Stereotypes in Scratch Programmin…

Gendered Prompting and LLM Code Review: How Gender Cues in the Prompt Shape Code Quality and Evaluation

LLMs are increasingly embedded in programming workflows, from code generation to automated code review. Yet, how gendered communication styles interact with LLM-assisted programming and code review remains underexplored. We present a…

Software Engineering · Computer Science 2026-03-26 Lynn Janzen, Üveys Eroglu, Dorothea Kolossa, Pia Knöferle, Sebastian Möller, Vera Schmitt, Veronika Solopova

Evaluation of Large Language Models: STEM education and Gender Stereotypes

Large Language Models (LLMs) have an increasing impact on our lives with use cases such as chatbots, study support, coding support, ideation, writing assistance, and more. Previous studies have revealed linguistic biases in pronouns used to…

Computation and Language · Computer Science 2024-06-17 Smilla Due, Sneha Das, Marianne Andersen, Berta Plandolit López, Sniff Andersen Nexø, Line Clemmensen

Investigating Gender Bias in LLM-Generated Stories via Psychological Stereotypes

As Large Language Models (LLMs) are increasingly used across different applications, concerns about their potential to amplify gender biases in various tasks are rising. Prior research has often probed gender bias using explicit gender cues…

Computation and Language · Computer Science 2025-08-06 Shahed Masoudian, Gustavo Escobedo, Hannah Strauss, Markus Schedl

Code Perfumes: Reporting Good Code to Encourage Learners

Block-based programming languages like Scratch enable children to be creative while learning to program. Even though the block-based approach simplifies the creation of programs, learning to program can nevertheless be challenging.…

Software Engineering · Computer Science 2021-08-16 Florian Obermüller, Lena Bloch, Luisa Greifenstein, Ute Heuer, Gordon Fraser

Gender bias and stereotypes in Large Language Models

Large Language Models (LLMs) have made substantial progress in the past several months, shattering state-of-the-art benchmarks in many domains. This paper investigates LLMs' behavior with respect to gender stereotypes, a known issue for…

Computation and Language · Computer Science 2023-08-30 Hadas Kotek, Rikker Dockum, David Q. Sun

A Comprehensive Study of Implicit and Explicit Biases in Large Language Models

Large Language Models (LLMs) inherit explicit and implicit biases from their training datasets. Identifying and mitigating biases in LLMs is crucial to ensure fair outputs, as they can perpetuate harmful stereotypes and misinformation. This…

Machine Learning · Computer Science 2025-11-19 Fatima Kazi, Alex Young, Yash Inani, Setareh Rafatirad

LLMs Reproduce Stereotypes of Sexual and Gender Minorities

A large body of research has found substantial gender bias in NLP systems. Most of this research takes a binary, essentialist view of gender: limiting its variation to the categories _men_ and _women_, conflating gender with sex, and…

Computation and Language · Computer Science 2025-09-25 Ruby Ostrow, Adam Lopez

The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data

With the wide and cross-domain adoption of Large Language Models, it becomes crucial to assess to which extent the statistical correlations in training data, which underlie their impressive performance, hide subtle and potentially troubling…

Artificial Intelligence · Computer Science 2025-12-12 Massimiliano Luca, Ciro Beneduce, Bruno Lepri, Jacopo Staiano

Measuring Bias or Measuring the Task: Understanding the Brittle Nature of LLM Gender Biases

As LLMs are increasingly applied in socially impactful settings, concerns about gender bias have prompted growing efforts both to measure and mitigate such bias. These efforts often rely on evaluation tasks that differ from natural language…

Computation and Language · Computer Science 2025-09-11 Bufan Gao, Elisa Kreiss

Empirical Investigation of the Relationship Between Design Smells and Role Stereotypes

During software development, poor design and implementation choices can detrimentally impact software maintainability. Design smells, recurring patterns of poorly designed fragments, signify these issues. Role-stereotypes denote the generic…

Software Engineering · Computer Science 2024-06-28 Daniel Ogenrwot, Joyce Nakatumba-Nabende, John Businge, Michel R. V. Chaudron

More Women, Same Stereotypes: Unpacking the Gender Bias Paradox in Large Language Models

Large Language Models (LLMs) have revolutionized natural language processing, yet concerns persist regarding their tendency to reflect or amplify social biases. This study introduces a novel evaluation framework to uncover gender biases in…

Computation and Language · Computer Science 2026-03-10 Evan Chen, Run-Jun Zhan, Yan-Bai Lin, Hung-Hsuan Chen

Bias Testing and Mitigation in LLM-based Code Generation

As the adoption of LLMs becomes more widespread in software coding ecosystems, a pressing issue has emerged: does the generated code contain social bias and unfairness, such as those related to age, gender, and race? This issue concerns the…

Software Engineering · Computer Science 2025-03-24 Dong Huang, Jie M. Zhang, Qingwen Bu, Xiaofei Xie, Junjie Chen, Heming Cui

Addressing Stereotypes in Large Language Models: A Critical Examination and Mitigation

Large Language models (LLMs), such as ChatGPT, have gained popularity in recent years with the advancement of Natural Language Processing (NLP), with use cases spanning many disciplines and daily lives as well. LLMs inherit explicit and…

Computation and Language · Computer Science 2025-12-01 Fatima Kazi

Bias, Accuracy, and Trust: Gender-Diverse Perspectives on Large Language Models

Large language models (LLMs) are becoming increasingly ubiquitous in our daily lives, but numerous concerns about bias in LLMs exist. This study examines how gender-diverse populations perceive bias, accuracy, and trustworthiness in LLMs,…

Human-Computer Interaction · Computer Science 2025-07-09 Aimen Gaba, Emily Wall, Tejas Ramkumar Babu, Yuriy Brun, Kyle Hall, Cindy Xiong Bearfield

Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback

As teachers increasingly turn to GenAI in their educational practice, we need robust methods to benchmark large language models (LLMs) for pedagogical purposes. This article presents an embedding-based benchmarking framework to detect bias…

Computation and Language · Computer Science 2026-04-02 Yishan Du, Conrad Borchers, Mutlu Cukurova

Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance

Large Language Models (LLMs) are increasingly utilized in educational tasks such as providing writing suggestions to students. Despite their potential, LLMs are known to harbor inherent biases which may negatively impact learners. Previous…

Computation and Language · Computer Science 2023-11-07 Thiemo Wambsganss, Xiaotian Su, Vinitra Swamy, Seyed Parsa Neshaei, Roman Rietsche, Tanja Käser

Towards Auditing Large Language Models: Improving Text-based Stereotype Detection

Large Language Models (LLM) have made significant advances in the recent past becoming more mainstream in Artificial Intelligence (AI) enabled human-facing applications. However, LLMs often generate stereotypical output inherited from…

Computation and Language · Computer Science 2023-11-27 Wu Zekun, Sahan Bulathwela, Adriano Soares Koshiyama

Stereotype Detection in LLMs: A Multiclass, Explainable, and Benchmark-Driven Approach

Stereotype detection is a challenging and subjective task, as certain statements, such as "Black people like to play basketball," may not appear overtly toxic but still reinforce racial stereotypes. With the increasing prevalence of large…

Computation and Language · Computer Science 2024-11-19 Zekun Wu, Sahan Bulathwela, Maria Perez-Ortiz, Adriano Soares Koshiyama

Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings

Large language models (LLMs) are the foundation of the current successes of artificial intelligence (AI), however, they are unavoidably biased. To effectively communicate the risks and encourage mitigation efforts these models need adequate…

Computation and Language · Computer Science 2025-01-14 Carolin M. Schuster, Maria-Alexandra Dinisor, Shashwat Ghatiwala, Georg Groh

Assessing Gender Bias in LLMs: Comparing LLM Outputs with Human Perceptions and Official Statistics

This study investigates gender bias in large language models (LLMs) by comparing their gender perception to that of human respondents, U.S. Bureau of Labor Statistics data, and a 50% no-bias benchmark. We created a new evaluation set using…

Computation and Language · Computer Science 2024-11-22 Tetiana Bas