Related papers: Quantifying Stereotypes in Language

StereoSet: Measuring stereotypical bias in pretrained language models

A stereotype is an over-generalized belief about a particular group of people, e.g., Asians are good at math or Asians are bad drivers. Such beliefs (biases) are known to hurt target groups. Since pretrained language models are trained on…

Computation and Language · Computer Science 2020-04-21 Moin Nadeem, Anna Bethke, Siva Reddy

Detecting Linguistic Indicators for Stereotype Assessment with Large Language Models

Social categories and stereotypes are embedded in language and can introduce data bias into Large Language Models (LLMs). Despite safeguards, these biases often persist in model behavior, potentially leading to representational harm in…

Computation and Language · Computer Science 2025-02-27 Rebekka Görge, Michael Mock, Héctor Allende-Cid

Hate Speech Classifiers Learn Human-Like Social Stereotypes

Social stereotypes negatively impact individuals' judgements about different groups and may have a critical role in how people understand language directed toward minority social groups. Here, we assess the role of social stereotypes in the…

Computation and Language · Computer Science 2021-10-29 Aida Mostafazadeh Davani, Mohammad Atari, Brendan Kennedy, Morteza Dehghani

Stepmothers are mean and academics are pretentious: What do pretrained language models learn about you?

In this paper, we investigate what types of stereotypical information are captured by pretrained language models. We present the first dataset comprising stereotypical attributes of a range of social groups and propose a method to elicit…

Computation and Language · Computer Science 2021-09-22 Rochelle Choenni, Ekaterina Shutova, Robert van Rooij

Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models

This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models when tackling the WinoBias pronoun resolution task. We find evidence that gender stereotype…

Computation and Language · Computer Science 2021-02-17 Daniel de Vassimon Manela, David Errington, Thomas Fisher, Boris van Breugel, Pasquale Minervini

Language, communication and society: a gender based linguistics analysis

The purpose of this study is to find evidence for supporting the hypothesis that language is the mirror of our thinking, our prejudices and cultural stereotypes. In this analysis, a questionnaire was administered to 537 people. The answers…

Computation and Language · Computer Science 2020-07-15 P. Cutugno, D. Chiarella, R. Lucentini, L. Marconi, G. Morgavi

A Survey on Stereotype Detection in Natural Language Processing

Stereotypes influence social perceptions and can escalate into discrimination and violence. While NLP research has extensively addressed gender bias and hate speech, stereotype detection remains an emerging field with significant societal…

Computation and Language · Computer Science 2025-10-08 Alessandra Teresa Cignarella, Anastasia Giachanou, Els Lefever

Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models

Large language models (LLMs) have been shown to propagate and amplify harmful stereotypes, particularly those that disproportionately affect marginalised communities. To understand the effect of these stereotypes more comprehensively, we…

Computation and Language · Computer Science 2024-10-10 Zara Siddique, Liam D. Turner, Luis Espinosa-Anke

Counteracts: Testing Stereotypical Representation in Pre-trained Language Models

Recently, language models have demonstrated strong performance on various natural language understanding tasks. Language models trained on large human-generated corpora encode not only a significant amount of human knowledge, but also the…

Computation and Language · Computer Science 2023-04-10 Damin Zhang, Julia Rayz, Romila Pradhan

Towards Auditing Large Language Models: Improving Text-based Stereotype Detection

Large Language Models (LLM) have made significant advances in the recent past becoming more mainstream in Artificial Intelligence (AI) enabled human-facing applications. However, LLMs often generate stereotypical output inherited from…

Computation and Language · Computer Science 2023-11-27 Wu Zekun, Sahan Bulathwela, Adriano Soares Koshiyama

Understanding and Countering Stereotypes: A Computational Approach to the Stereotype Content Model

Stereotypical language expresses widely-held beliefs about different social categories. Many stereotypes are overtly negative, while others may appear positive on the surface, but still lead to negative consequences. In this work, we…

Computers and Society · Computer Science 2021-06-07 Kathleen C. Fraser, Isar Nejadgholi, Svetlana Kiritchenko

Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models

To recognize and mitigate harms from large language models (LLMs), we need to understand the prevalence and nuances of stereotypes in LLM outputs. Toward this end, we present Marked Personas, a prompt-based method to measure stereotypes in…

Computation and Language · Computer Science 2023-05-30 Myra Cheng, Esin Durmus, Dan Jurafsky

Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning

Due to their similarity-based learning objectives, pretrained sentence encoders often internalize stereotypical assumptions that reflect the social biases that exist within their training corpora. In this paper, we describe several kinds of…

Computation and Language · Computer Science 2023-03-13 Hongyin Luo, James Glass

Quantifying and Reducing Stereotypes in Word Embeddings

Machine learning algorithms are optimized to model statistical properties of the training data. If the input data reflects stereotypes and biases of the broader society, then the output of the learning algorithm also captures these…

Computation and Language · Computer Science 2016-06-21 Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai

On The Role of Reasoning in the Identification of Subtle Stereotypes in Natural Language

Large language models (LLMs) are trained on vast, uncurated datasets that contain various forms of biases and language reinforcing harmful stereotypes that may be subsequently inherited by the models themselves. Therefore, it is essential…

Computation and Language · Computer Science 2024-10-01 Jacob-Junqi Tian, Omkar Dige, D. B. Emerson, Faiza Khan Khattak

Measuring Stereotype and Deviation Biases in Large Language Models

Large language models (LLMs) are widely applied across diverse domains, raising concerns about their limitations and potential risks. In this study, we investigate two types of bias that LLMs may display: stereotype bias and deviation bias.…

Computation and Language · Computer Science 2026-05-20 Daniel Wang, Eli Brignac, Minjia Mao, Xiao Fang

A Proposal for Linguistic Similarity Datasets Based on Commonality Lists

Similarity is a core notion that is used in psychology and two branches of linguistics: theoretical and computational. The similarity datasets that come from the two fields differ in design: psychological datasets are focused around a…

Computation and Language · Computer Science 2016-06-20 Dmitrijs Milajevs, Sascha Griffiths

Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings

Large language models (LLMs) are the foundation of the current successes of artificial intelligence (AI); however, they are unavoidably biased. To effectively communicate the risks and encourage mitigation efforts, these models need adequate…

Computation and Language · Computer Science 2025-01-14 Carolin M. Schuster, Maria-Alexandra Dinisor, Shashwat Ghatiwala, Georg Groh

Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets

Accurately measuring gender stereotypical bias in language models is a complex task with many hidden aspects. Current benchmarks have underestimated this multifaceted challenge and failed to capture the full extent of the problem. This…

Computation and Language · Computer Science 2025-09-25 Mahdi Zakizadeh, Mohammad Taher Pilehvar

CO-STAR: Conceptualisation of Stereotypes for Analysis and Reasoning

Warning: this paper contains material which may be offensive or upsetting. While much of recent work has focused on the detection of hate speech and overtly offensive content, very little research has explored the more subtle but equally…

Computation and Language · Computer Science 2021-12-03 Teyun Kwon, Anandha Gopalan