Related papers: Statistically Testing Training Data for Unwanted E…

A fuzzy-rough uncertainty measure to discover bias encoded explicitly or implicitly in features of structured pattern classification datasets

The need to measure bias encoded in tabular data that are used to solve pattern recognition problems is widely recognized by academia, legislators and enterprises alike. In previous work, we proposed a bias quantification measure, called…

Machine Learning · Computer Science 2022-01-24 Gonzalo Nápoles, Lisa Koutsoviti Koumeri

Data-Driven Fuzzy Modeling Using Deep Learning

Fuzzy modeling has many advantages over the non-fuzzy methods, such as robustness against uncertainties and less sensitivity to the varying dynamics of nonlinear systems. Data-driven fuzzy modeling needs to extract fuzzy rules from the…

Systems and Control · Computer Science 2018-06-08 Erick de la Rosa, Wen Yu

Learning from Imprecise and Fuzzy Observations: Data Disambiguation through Generalized Loss Minimization

Methods for analyzing or learning from "fuzzy data" have attracted increasing attention in recent years. In many cases, however, existing methods (for precise, non-fuzzy data) are extended to the fuzzy case in an ad-hoc manner, and without…

Machine Learning · Computer Science 2017-10-10 Eyke Hüllermeier

Model Debiasing by Learnable Data Augmentation

Deep Neural Networks are well known for efficiently fitting training data, yet experiencing poor generalization capabilities whenever some kind of bias dominates over the actual task labels, resulting in models learning "shortcuts". In…

Machine Learning · Computer Science 2024-08-12 Pietro Morerio, Ruggero Ragonesi, Vittorio Murino

Do Machine Learning Models Learn Statistical Rules Inferred from Data?

Machine learning models can make critical errors that are easily hidden within vast amounts of data. Such errors often run counter to rules based on human intuition. However, rules based on human knowledge are challenging to scale or to…

Machine Learning · Computer Science 2023-06-08 Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong

Bayesianize Fuzziness in the Statistical Analysis of Fuzzy Data

Fuzzy data, prevalent in social sciences and other fields, capture uncertainties arising from subjective evaluations and measurement imprecision. Despite significant advancements in fuzzy statistics, a unified inferential regression-based…

Methodology · Statistics 2025-06-05 Antonio Calcagnì, Przemysław Grzegorzewski, Maciej Romaniuk

FR-Train: A Mutual Information-Based Approach to Fair and Robust Training

Trustworthy AI is a critical issue in machine learning where, in addition to training a model that is accurate, one must consider both fair and robust training in the presence of data bias and poisoning. However, the existing model fairness…

Machine Learning · Computer Science 2020-07-06 Yuji Roh, Kangwook Lee, Steven Euijong Whang, Changho Suh

Looking at Model Debiasing through the Lens of Anomaly Detection

It is widely recognized that deep neural networks are sensitive to bias in the data. This means that during training these models are likely to learn spurious correlations between data and labels, resulting in limited generalization…

Machine Learning · Computer Science 2024-12-06 Vito Paolo Pastore, Massimiliano Ciranni, Davide Marinelli, Francesca Odone, Vittorio Murino

Toward More Generalized Malicious URL Detection Models

This paper reveals a data bias issue that can severely affect the performance while conducting a machine learning model for malicious URL detection. We describe how such bias can be identified using interpretable machine learning…

Machine Learning · Computer Science 2024-02-12 YunDa Tsai, Cayon Liow, Yin Sheng Siang, Shou-De Lin

Analyzing Bias in Sensitive Personal Information Used to Train Financial Models

Bias in data can have unintended consequences that propagate to the design, development, and deployment of machine learning models. In the financial services sector, this can result in discrimination from certain financial instruments and…

Cryptography and Security · Computer Science 2019-11-12 Reginald Bryant, Celia Cintas, Isaac Wambugu, Andrew Kinai, Komminist Weldemariam

On Inductive Biases for Machine Learning in Data Constrained Settings

Learning with limited data is one of the biggest problems of machine learning. Current approaches to this issue consist in learning general representations from huge amounts of data before fine-tuning the model on a small dataset of…

Machine Learning · Computer Science 2023-02-22 Grégoire Mialon

On the Reduction of Biases in Big Data Sets for the Detection of Irregular Power Usage

In machine learning, a bias occurs whenever training sets are not representative for the test data, which results in unreliable models. The most common biases in data are arguably class imbalance and covariate shift. In this work, we aim to…

Machine Learning · Computer Science 2018-04-04 Patrick Glauner, Radu State, Petko Valtchev, Diogo Duarte

Statistical Learning from Biased Training Samples

With the deluge of digitized information in the Big Data era, massive datasets are becoming increasingly available for learning predictive models. However, in many practical situations, the poor control of the data acquisition processes may…

Machine Learning · Statistics 2022-11-02 Stephan Clémençon, Pierre Laforgue

Robust Federated Training via Collaborative Machine Teaching using Trusted Instances

Federated learning performs distributed model training using local data hosted by agents. It shares only model parameter updates for iterative aggregation at the server. Although it is privacy-preserving by design, federated learning is…

Machine Learning · Computer Science 2019-05-09 Yufei Han, Xiangliang Zhang

Towards Reliable Testing of Machine Unlearning

Machine learning components are now central to AI-infused software systems, from recommendations and code assistants to clinical decision support. As regulations and governance frameworks increasingly require deleting sensitive data from…

Machine Learning · Computer Science 2026-04-21 Anna Mazhar, Sainyam Galhotra

Simultaneous Improvement of ML Model Fairness and Performance by Identifying Bias in Data

Machine learning models built on datasets containing discriminative instances attributed to various underlying factors result in biased and unfair outcomes. It's a well founded and intuitive fact that existing bias mitigation strategies…

Machine Learning · Computer Science 2022-10-25 Bhushan Chaudhari, Akash Agarwal, Tanmoy Bhowmik

Certifying Data-Bias Robustness in Linear Regression

Datasets typically contain inaccuracies due to human error and societal biases, and these inaccuracies can affect the outcomes of models trained on such datasets. We present a technique for certifying whether linear regression models are…

Machine Learning · Computer Science 2022-06-09 Anna P. Meyer, Aws Albarghouthi, Loris D'Antoni

Corrective Machine Unlearning

Machine Learning models increasingly face data integrity challenges due to the use of large-scale training datasets drawn from the Internet. We study what model developers can do if they detect that some data was manipulated or incorrect.…

Machine Learning · Computer Science 2024-10-18 Shashwat Goel, Ameya Prabhu, Philip Torr, Ponnurangam Kumaraguru, Amartya Sanyal

Learning from a Biased Sample

The empirical risk minimization approach to data-driven decision making requires access to training data drawn under the same conditions as those that will be faced when the decision rule is deployed. However, in a number of settings, we…

Methodology · Statistics 2025-09-17 Roshni Sahoo, Lihua Lei, Stefan Wager

Federated Learning Inspired Fuzzy Systems: Decentralized Rule Updating for Privacy and Scalable Decision Making

Fuzzy systems are a way to allow machines, systems and frameworks to deal with uncertainty, which is not possible in binary systems that most computers use. These systems have already been deployed for certain use cases, and fuzzy systems…

Machine Learning · Computer Science 2025-07-10 Arthur Alexander Lim, Zhen Bin It, Jovan Bowen Heng, Tee Hui Teo