Related papers: Detecting Compromised Implicit Association Test Re…
The Implicit Association Test (IAT) is widely used to measure hidden (subconscious) human biases, or implicit bias, on many topics: race, gender, age, ethnicity, and religious stereotypes. There is a need to understand the reliability of these…
Due to the implementation of guardrails by developers, large language models (LLMs) have demonstrated exceptional performance in explicit bias tests. However, bias in LLMs may occur not only explicitly, but also implicitly, much like humans who…
Opinion polls have become a very important component of society because they are now a de facto component of our daily news cycle and because their results influence governments and businesses in ways which are not always obvious to us.…
Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases, similar to humans who endorse egalitarian beliefs yet exhibit subtle biases. Measuring such implicit biases can be a challenge: as LLMs…
Reliance on stereotypes is a persistent feature of human decision-making and has been extensively documented in educational settings, where it can shape students' confidence, performance, and long-term human capital accumulation. While…
As large language models (LLMs) become increasingly integrated into our lives, their inherent social biases remain a pressing concern. Detecting and evaluating these biases can be challenging because they are often implicit rather than…
This paper investigates the subtle and often concealed biases present in Large Language Models (LLMs), focusing on implicit biases that may remain despite passing explicit bias tests. Implicit biases are significant because they influence…
Implicit biases refer to automatic mental processes that shape perceptions, judgments, and behaviors. Previous research on "implicit bias" in LLMs focused primarily on outputs rather than the processes underlying the outputs. We present the…
Objective. We establish a principled method for inferring mental health related psychometric variables from neural and behavioral data using the Implicit Association Test (IAT) as the data generation engine, aiming to overcome the limited…
Aspect-based sentiment analysis aims to identify the sentiment polarity of a specific aspect in product reviews. We notice that about 30% of reviews do not contain obvious opinion words, but still convey clear human-aware sentiment…
Unbiased data collection is essential to guaranteeing fairness in artificial intelligence models. Implicit bias, a form of behavioral conditioning that leads us to attribute predetermined characteristics to members of certain groups and…
The Implicit Association Test (IAT) is a common behavioral paradigm to assess implicit attitudes in various research contexts. In recent years, researchers have sought to collect IAT data remotely using online applications. Compared to…
Language is a popular resource to mine speakers' attitude bias, supposing that speakers' statements represent their bias on concepts. However, psychology studies show that people's explicit bias in statements can be different from their…
Drawing on constructs from psychology, prior work has identified a distinction between explicit and implicit bias in large language models (LLMs). While many LLMs undergo post-training alignment and safety procedures to avoid expressions of…
Theory of Mind (ToM) in Large Language Models (LLMs) refers to the model's ability to infer the mental states of others, with failures in this ability often manifesting as systemic implicit biases. Assessing this challenge is difficult, as…
We study misspecified Bayesian learning in principal-agent relationships, where an agent is assessed by an evaluator and rewarded by the market. The agent's outcome depends on their innate ability, costly effort -- whose effectiveness is…
Linear probes are a promising approach for monitoring AI systems for deceptive behaviour. Previous work has shown that a linear classifier trained on a contrastive instruction pair and a simple dataset can achieve good performance. However,…
Semi-supervised learning is an important and active topic of research in pattern recognition. For classification using linear discriminant analysis specifically, several semi-supervised variants have been proposed. Using any one of these…
While various approaches have recently been studied for bias identification, little is known about how implicit language that does not explicitly convey a viewpoint affects bias amplification in large language models. To examine the…
Semi-supervised learning is a setting in which one has labeled and unlabeled data available. In this survey we explore different types of theoretical results when one uses unlabeled data in classification and regression tasks. Most methods…