Related papers: A Non-Intrusive Correction Algorithm for Classific…

Corrective Machine Unlearning

Machine Learning models increasingly face data integrity challenges due to the use of large-scale training datasets drawn from the Internet. We study what model developers can do if they detect that some data was manipulated or incorrect.…

Machine Learning · Computer Science 2024-10-18 Shashwat Goel, Ameya Prabhu, Philip Torr, Ponnurangam Kumaraguru, Amartya Sanyal

Learning Deep Neural Networks under Agnostic Corrupted Supervision

Training deep neural models in the presence of corrupted supervision is challenging as the corrupted data points may significantly impact the generalization performance. To alleviate this problem, we present an efficient robust algorithm…

Machine Learning · Computer Science 2021-02-16 Boyang Liu, Mengying Sun, Ding Wang, Pang-Ning Tan, Jiayu Zhou

Computationally Tractable Algorithms for Finding a Subset of Non-defective Items from a Large Population

In the classical non-adaptive group testing setup, pools of items are tested together, and the main goal of a recovery algorithm is to identify the "complete defective set" given the outcomes of different group tests. In contrast, the main…

Information Theory · Computer Science 2016-03-01 Abhay Sharma, Chandra R. Murthy

A SMART Stochastic Algorithm for Nonconvex Optimization with Applications to Robust Machine Learning

In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed…

Machine Learning · Statistics 2017-02-07 Aleksandr Aravkin, Damek Davis

Navigating Data Corruption in Machine Learning: Balancing Quality, Quantity, and Imputation Strategies

Data corruption, including missing and noisy data, poses significant challenges in real-world machine learning. This study investigates the effects of data corruption on model performance and explores strategies to mitigate these effects…

Machine Learning · Computer Science 2025-05-22 Qi Liu, Wanjing Ma

Non-Parametric Calibration for Classification

Many applications of classification methods not only require high accuracy but also reliable estimation of predictive uncertainty. However, while many current classification frameworks, in particular deep neural networks, achieve high…

Machine Learning · Computer Science 2020-02-28 Jonathan Wenger, Hedvig Kjellström, Rudolph Triebel

Non-intrusive model reduction of large-scale, nonlinear dynamical systems using deep learning

Projection-based model reduction has become a popular approach to reduce the cost associated with integrating large-scale dynamical systems so they can be used in many-query settings such as optimization and uncertainty quantification. For…

Numerical Analysis · Mathematics 2020-08-26 Han Gao, Jian-Xun Wang, Matthew J. Zahr

Learning with Bad Training Data via Iterative Trimmed Loss Minimization

In this paper, we study a simple and generic framework to tackle the problem of learning model parameters when a fraction of the training samples are corrupted. We first make a simple observation: in a variety of such settings, the…

Machine Learning · Computer Science 2019-02-20 Yanyao Shen, Sujay Sanghavi

Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy

This paper proposes a novel non-intrusive system failure prediction technique using available information from developers and minimal information from raw logs (rather than mining entire logs) but keeping the data entirely private with the…

Artificial Intelligence · Computer Science 2024-09-20 Dibakar Das, Vikram Seshasai, Vineet Sudhir Bhat, Pushkal Juneja, Jyotsna Bapat, Debabrata Das

Rule Mining for Correcting Classification Models

Machine learning models need to be continually updated or corrected to ensure that the prediction accuracy remains consistently high. In this study, we consider scenarios where developers should be careful to change the prediction results…

Software Engineering · Computer Science 2023-10-17 Hirofumi Suzuki, Hiroaki Iwashita, Takuya Takagi, Yuta Fujishige, Satoshi Hara

Model Repair: Robust Recovery of Over-Parameterized Statistical Models

A new type of robust estimation problem is introduced where the goal is to recover a statistical model that has been corrupted after it has been estimated from data. Methods are proposed for "repairing" the model using only the design and…

Statistics Theory · Mathematics 2020-05-21 Chao Gao, John Lafferty

Classification and Uncertainty Quantification of Corrupted Data using Semi-Supervised Autoencoders

Parametric and non-parametric classifiers often have to deal with real-world data, where corruptions like noise, occlusions, and blur are unavoidable - posing significant challenges. We present a probabilistic approach to classify strongly…

Machine Learning · Computer Science 2023-04-24 Philipp Joppich, Sebastian Dorn, Oliver De Candido, Wolfgang Utschick, Jakob Knollmüller

Preventing the Generation of Inconsistent Sets of Classification Rules

In recent years, the interest in interpretable classification models has grown. One of the proposed ways to improve the interpretability of a rule-based classification model is to use sets (unordered collections) of rules, instead of lists…

Machine Learning · Computer Science 2020-03-31 Thiago Zafalon Miranda, Diorge Brognara Sardinha, Ricardo Cerri

Missing Data Imputation for Supervised Learning

Missing data imputation can help improve the performance of prediction models in situations where missing data hide useful information. This paper compares methods for imputing missing categorical data for supervised classification tasks.…

Machine Learning · Statistics 2020-08-11 Jason Poulos, Rafael Valle

Machine Learning (ML) based Reduced Order Modeling (ROM) for linear and non-linear solid and structural mechanics

Multiple model reduction techniques have been proposed to tackle linear and non linear problems. Intrusive model order reduction techniques exhibit high accuracy levels, however, they are rarely used as a standalone industrial tool, because…

Computational Engineering, Finance, and Science · Computer Science 2025-04-10 Mikhael Tannous, Chady Ghnatios, Eivind Fonn, Trond Kvamsdal, Francisco Chinesta

Better Multi-class Probability Estimates for Small Data Sets

Many classification applications require accurate probability estimates in addition to good class separation but often classifiers are designed focusing only on the latter. Calibration is the process of improving probability estimates by…

Machine Learning · Computer Science 2020-01-31 Tuomo Alasalmi, Jaakko Suutala, Heli Koskimäki, Juha Röning

Learning with Monotone Adversarial Corruptions

We study the extent to which standard machine learning algorithms rely on exchangeability and independence of data by introducing a monotone adversarial corruption model. In this model, an adversary, upon looking at a "clean" i.i.d.…

Machine Learning · Computer Science 2026-01-06 Kasper Green Larsen, Chirag Pabbaraju, Abhishek Shetty

Uncovering Coresets for Classification With Multi-Objective Evolutionary Algorithms

A coreset is a subset of the training set, using which a machine learning algorithm obtains performances similar to what it would deliver if trained over the whole original data. Coreset discovery is an active and open line of research as…

Machine Learning · Computer Science 2020-02-21 Pietro Barbiero, Giovanni Squillero, Alberto Tonda

Summarization and Classification of Non-Poisson Point Processes

Fitting models for non-Poisson point processes is complicated by the lack of tractable models for much of the data. By using large samples of independent and identically distributed realizations and statistical learning, it is possible to…

Methodology · Statistics 2007-12-04 Jeffrey Picka, Mingxia Deng

Machine learning with incomplete datasets using multi-objective optimization models

Machine learning techniques have been developed to learn from complete data. When missing values exist in a dataset, the incomplete data should be preprocessed separately by removing data points with missing values or imputation. In this…

Machine Learning · Computer Science 2020-12-25 Hadi A. Khorshidi, Michael Kirley, Uwe Aickelin