Related papers: A Note on Data Biases in Generative Models

200 papers

Deep generative models produce data according to a learned representation, e.g. diffusion models, through a process of approximation computing possible samples. Approximation can be understood as reconstruction and the large datasets used…

Human-Computer Interaction · Computer Science 2023-09-25 Luís Arandas, Mick Grierson, Miguel Carvalhais

Machine learning applications are becoming increasingly pervasive in our society. Since these decision-making systems rely on data-driven learning, the risk is that they will systematically spread the bias embedded in data. In this paper, we…

Machine Learning · Statistics 2023-02-09 Alessandro Castelnovo, Riccardo Crupi, Nicole Inverardi, Daniele Regoli, Andrea Cosentini

Predictive models often reinforce biases which were originally embedded in their training data, through skewed decisions. In such cases, mitigation methods are critical to ensure that, regardless of the prevailing disparities, model…

Machine Learning · Statistics 2025-07-15 Ricardo Inácio, Zafeiris Kokkinogenis, Vitor Cerqueira, Carlos Soares

Learning with limited data is one of the biggest problems of machine learning. Current approaches to this issue consist in learning general representations from huge amounts of data before fine-tuning the model on a small dataset of…

Machine Learning · Computer Science 2023-02-22 Grégoire Mialon

Bias is known to be an impediment to fair decisions in many domains such as human resources, the public sector, health care etc. Recently, hope has been expressed that the use of machine learning methods for taking such decisions would…

Machine Learning · Computer Science 2019-09-05 Jindong Gu, Daniela Oelke

Models that are learned from real-world data are often biased because the data used to train them is biased. This can propagate systemic human biases that exist and ultimately lead to inequitable treatment of people, especially minorities.…

Computer Vision and Pattern Recognition · Computer Science 2019-07-01 Daniel McDuff, Shuang Ma, Yale Song, Ashish Kapoor

With the advent of generative modeling techniques, synthetic data and its use has penetrated across various domains from unstructured data such as image, text to structured dataset modeling healthcare outcome, risk decisioning in financial…

Machine Learning · Computer Science 2021-05-11 Aman Gupta, Deepak Bhatt, Anubha Pandey

The unparalleled ability of machine learning algorithms to learn patterns from data also enables them to incorporate biases embedded within. A biased model can then make decisions that disproportionately harm certain groups in society. Much…

Machine Learning · Computer Science 2022-06-28 José Pombal, Pedro Saleiro, Mário A. T. Figueiredo, Pedro Bizarro

Real-world datasets are often biased with respect to key demographic factors such as race and gender. Due to the latent nature of the underlying factors, detecting and mitigating bias is especially challenging for unsupervised machine…

Machine Learning · Computer Science 2020-07-01 Kristy Choi, Aditya Grover, Trisha Singh, Rui Shu, Stefano Ermon

Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, since they are highly data-driven, relying on billion-sized datasets randomly…

Predictive algorithms have a powerful potential to offer benefits in areas as varied as medicine or education. However, these algorithms and the data they use are built by humans, consequently, they can inherit the bias and prejudices…

Human-Computer Interaction · Computer Science 2022-03-22 Cristina Manresa-Yee, Silvia Ramis

As the demand for high-quality training data escalates, researchers have increasingly turned to generative models to create synthetic data, addressing data scarcity and enabling continuous model improvement. However, reliance on…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Zeliang Zhang, Xin Liang, Mingqian Feng, Susan Liang, Chenliang Xu

Synthetic data is emerging as a substitute for authentic data to solve ethical and legal challenges in handling authentic face data. The current models can create real-looking face images of people who do not exist. However, it is a known…

Computer Vision and Pattern Recognition · Computer Science 2023-11-08 Marco Huber, Anh Thi Luu, Fadi Boutros, Arjan Kuijper, Naser Damer

In high dimensional settings, density estimation algorithms rely crucially on their inductive bias. Despite recent empirical success, the inductive bias of deep generative models is not well understood. In this paper we propose a framework…

Machine Learning · Computer Science 2018-11-09 Shengjia Zhao, Hongyu Ren, Arianna Yuan, Jiaming Song, Noah Goodman, Stefano Ermon

How do we learn from biased data? Historical datasets often reflect historical prejudices; sensitive or protected attributes may affect the observed treatments and outcomes. Classification algorithms tasked with predicting outcomes…

Machine Learning · Computer Science 2018-12-04 David Madras, Elliot Creager, Toniann Pitassi, Richard Zemel

The widespread adoption of generative AI models has raised growing concerns about representational harm and potential discriminatory outcomes. Yet, despite growing literature on this topic, the mechanisms by which bias emerges - especially…

Computer Vision and Pattern Recognition · Computer Science 2025-06-12 Xiaofeng Zhang, Michelle Lin, Simon Lacoste-Julien, Aaron Courville, Yash Goyal

In this paper, we propose a new framework for mitigating biases in machine learning systems. The problem of the existing mitigation approaches is that they are model-oriented in the sense that they focus on tuning the training algorithms to…

Machine Learning · Computer Science 2019-05-27 Adel Abusitta, Esma Aïmeur, Omar Abdel Wahab

It is widely recognized that deep neural networks are sensitive to bias in the data. This means that during training these models are likely to learn spurious correlations between data and labels, resulting in limited generalization…

Machine Learning · Computer Science 2024-12-06 Vito Paolo Pastore, Massimiliano Ciranni, Davide Marinelli, Francesca Odone, Vittorio Murino

Large-scale behavioral datasets enable researchers to use complex machine learning algorithms to better predict human behavior, yet this increased predictive power does not always lead to a better understanding of the behavior in question.…

Computers and Society · Computer Science 2019-05-14 Mayank Agrawal, Joshua C. Peterson, Thomas L. Griffiths

An open secret in contemporary machine learning is that many models work beautifully on standard benchmarks but fail to generalize outside the lab. This has been attributed to biased training data, which provide poor coverage over real…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Ali Jahanian, Lucy Chai, Phillip Isola