Related papers: A Note on Data Biases in Generative Models

Antagonising explanation and revealing bias directly through sequencing and multimodal inference

Deep generative models produce data according to a learned representation, e.g. diffusion models, through a process of approximation computing possible samples. Approximation can be understood as reconstruction and the large datasets used…

Human-Computer Interaction · Computer Science 2023-09-25 Luís Arandas, Mick Grierson, Miguel Carvalhais

Investigating Bias with a Synthetic Data Generator: Empirical Evidence and Philosophical Interpretation

Machine learning applications are becoming increasingly pervasive in our society. Since these decision-making systems rely on data-driven learning, the risk is that they will systematically spread the bias embedded in data. In this paper, we…

Machine Learning · Statistics 2023-02-09 Alessandro Castelnovo, Riccardo Crupi, Nicole Inverardi, Daniele Regoli, Andrea Cosentini

Simulating Biases for Interpretable Fairness in Offline and Online Classifiers

Predictive models often reinforce biases which were originally embedded in their training data, through skewed decisions. In such cases, mitigation methods are critical to ensure that, regardless of the prevailing disparities, model…

Machine Learning · Statistics 2025-07-15 Ricardo Inácio, Zafeiris Kokkinogenis, Vitor Cerqueira, Carlos Soares

On Inductive Biases for Machine Learning in Data Constrained Settings

Learning with limited data is one of the biggest problems of machine learning. Current approaches to this issue consist in learning general representations from huge amounts of data before fine-tuning the model on a small dataset of…

Machine Learning · Computer Science 2023-02-22 Grégoire Mialon

Understanding Bias in Machine Learning

Bias is known to be an impediment to fair decisions in many domains such as human resources, the public sector, health care etc. Recently, hope has been expressed that the use of machine learning methods for taking such decisions would…

Machine Learning · Computer Science 2019-09-05 Jindong Gu, Daniela Oelke

Characterizing Bias in Classifiers using Generative Models

Models that are learned from real-world data are often biased because the data used to train them is biased. This can propagate systemic human biases that exist and ultimately lead to inequitable treatment of people, especially minorities.…

Computer Vision and Pattern Recognition · Computer Science 2019-07-01 Daniel McDuff, Shuang Ma, Yale Song, Ashish Kapoor

Transitioning from Real to Synthetic data: Quantifying the bias in model

With the advent of generative modeling techniques, synthetic data and its use has penetrated across various domains from unstructured data such as image, text to structured dataset modeling healthcare outcome, risk decisioning in financial…

Machine Learning · Computer Science 2021-05-11 Aman Gupta, Deepak Bhatt, Anubha Pandey

Prisoners of Their Own Devices: How Models Induce Data Bias in Performative Prediction

The unparalleled ability of machine learning algorithms to learn patterns from data also enables them to incorporate biases embedded within. A biased model can then make decisions that disproportionately harm certain groups in society. Much…

Machine Learning · Computer Science 2022-06-28 José Pombal, Pedro Saleiro, Mário A. T. Figueiredo, Pedro Bizarro

Fair Generative Modeling via Weak Supervision

Real-world datasets are often biased with respect to key demographic factors such as race and gender. Due to the latent nature of the underlying factors, detecting and mitigating bias is especially challenging for unsupervised machine…

Machine Learning · Computer Science 2020-07-01 Kristy Choi, Aditya Grover, Trisha Singh, Rui Shu, Stefano Ermon

Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness

Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, since they are highly data-driven, relying on billion-sized datasets randomly…

Machine Learning · Computer Science 2023-07-18 Felix Friedrich, Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Patrick Schramowski, Sasha Luccioni, Kristian Kersting

Assessing Gender Bias in Predictive Algorithms using eXplainable AI

Predictive algorithms have a powerful potential to offer benefits in areas as varied as medicine or education. However, these algorithms and the data they use are built by humans, consequently, they can inherit the bias and prejudices…

Human-Computer Interaction · Computer Science 2022-03-22 Cristina Manresa-Yee, Silvia Ramis

Will the Inclusion of Generated Data Amplify Bias Across Generations in Future Image Classification Models?

As the demand for high-quality training data escalates, researchers have increasingly turned to generative models to create synthetic data, addressing data scarcity and enabling continuous model improvement. However, reliance on…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Zeliang Zhang, Xin Liang, Mingqian Feng, Susan Liang, Chenliang Xu

Bias and Diversity in Synthetic-based Face Recognition

Synthetic data is emerging as a substitute for authentic data to solve ethical and legal challenges in handling authentic face data. The current models can create real-looking face images of people who do not exist. However, it is a known…

Computer Vision and Pattern Recognition · Computer Science 2023-11-08 Marco Huber, Anh Thi Luu, Fadi Boutros, Arjan Kuijper, Naser Damer

Bias and Generalization in Deep Generative Models: An Empirical Study

In high dimensional settings, density estimation algorithms rely crucially on their inductive bias. Despite recent empirical success, the inductive bias of deep generative models is not well understood. In this paper we propose a framework…

Machine Learning · Computer Science 2018-11-09 Shengjia Zhao, Hongyu Ren, Arianna Yuan, Jiaming Song, Noah Goodman, Stefano Ermon

Fairness Through Causal Awareness: Learning Latent-Variable Models for Biased Data

How do we learn from biased data? Historical datasets often reflect historical prejudices; sensitive or protected attributes may affect the observed treatments and outcomes. Classification algorithms tasked with predicting outcomes…

Machine Learning · Computer Science 2018-12-04 David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel

Bias Analysis in Unconditional Image Generative Models

The widespread adoption of generative AI models has raised growing concerns about representational harm and potential discriminatory outcomes. Yet, despite growing literature on this topic, the mechanisms by which bias emerges - especially…

Computer Vision and Pattern Recognition · Computer Science 2025-06-12 Xiaofeng Zhang, Michelle Lin, Simon Lacoste-Julien, Aaron Courville, Yash Goyal

Generative Adversarial Networks for Mitigating Biases in Machine Learning Systems

In this paper, we propose a new framework for mitigating biases in machine learning systems. The problem of the existing mitigation approaches is that they are model-oriented in the sense that they focus on tuning the training algorithms to…

Machine Learning · Computer Science 2019-05-27 Adel Abusitta, Esma Aïmeur, Omar Abdel Wahab

Looking at Model Debiasing through the Lens of Anomaly Detection

It is widely recognized that deep neural networks are sensitive to bias in the data. This means that during training these models are likely to learn spurious correlations between data and labels, resulting in limited generalization…

Machine Learning · Computer Science 2024-12-06 Vito Paolo Pastore, Massimiliano Ciranni, Davide Marinelli, Francesca Odone, Vittorio Murino

Using Machine Learning to Guide Cognitive Modeling: A Case Study in Moral Reasoning

Large-scale behavioral datasets enable researchers to use complex machine learning algorithms to better predict human behavior, yet this increased predictive power does not always lead to a better understanding of the behavior in question.…

Computers and Society · Computer Science 2019-05-14 Mayank Agrawal, Joshua C. Peterson, Thomas L. Griffiths

On the "steerability" of generative adversarial networks

An open secret in contemporary machine learning is that many models work beautifully on standard benchmarks but fail to generalize outside the lab. This has been attributed to biased training data, which provide poor coverage over real…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Ali Jahanian, Lucy Chai, Phillip Isola