Related papers: Sever: A Robust Meta-Algorithm for Stochastic Opti…

Reducing the Cost of Training Security Classifier (via Optimized Semi-Supervised Learning)

Background: Most of the existing machine learning models for security tasks, such as spam detection, malware detection, or network intrusion detection, are built on supervised machine learning algorithms. In such a paradigm, models need a…

Cryptography and Security · Computer Science 2022-05-03 Rui Shu, Tianpei Xia, Huy Tu, Laurie Williams, Tim Menzies

Robust SVD Made Easy: A fast and reliable algorithm for large-scale data analysis

The singular value decomposition (SVD) is a crucial tool in machine learning and statistical data analysis. However, it is highly susceptible to outliers in the data matrix. Existing robust SVD algorithms often sacrifice speed for…

Machine Learning · Statistics 2024-02-16 Sangil Han, Kyoowon Kim, Sungkyu Jung

Outlier Detection by Consistent Data Selection Method

Often the challenge associated with tasks like fraud and spam detection[1] is the lack of all likely patterns needed to train suitable supervised learning models. In order to overcome this limitation, such tasks are attempted as outlier or…

Machine Learning · Computer Science 2018-08-22 Utkarsh Porwal, Smruthi Mukund

Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling

Large Language Models (LLMs) are typically evaluated for safety under single-shot or low-budget adversarial prompting, which underestimates real-world risk. In practice, attackers can exploit large-scale parallel sampling to repeatedly…

Artificial Intelligence · Computer Science 2026-02-10 Mingqian Feng, Xiaodong Liu, Weiwei Yang, Chenliang Xu, Christopher White, Jianfeng Gao

SOBER: Highly Parallel Bayesian Optimization and Bayesian Quadrature over Discrete and Mixed Spaces

Batch Bayesian optimisation and Bayesian quadrature have been shown to be sample-efficient methods of performing optimisation and quadrature where expensive-to-evaluate objective functions can be queried in parallel. However, current…

Machine Learning · Computer Science 2023-07-06 Masaki Adachi, Satoshi Hayakawa, Saad Hamid, Martin Jørgensen, Harald Oberhauser, Micheal A. Osborne

Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems

In many machine learning tasks, a common approach for dealing with large-scale data is to build a small summary, e.g., coreset, that can efficiently represent the original input. However, real-world datasets usually contain outliers…

Machine Learning · Computer Science 2022-01-24 Zixiu Wang, Yiwen Guo, Hu Ding

Efficient Low-Rank Semidefinite Programming with Robust Loss Functions

In real-world applications, it is important for machine learning algorithms to be robust against data outliers or corruptions. In this paper, we focus on improving the robustness of a large class of learning algorithms that are formulated…

Machine Learning · Computer Science 2021-06-04 Quanming Yao, Hangsi Yang, En-Liang Hu, James Kwok

A Self-scaled Approximate $\ell_0$ Regularization Robust Model for Outlier Detection

Robust regression models in the presence of outliers have significant practical relevance in areas such as signal processing, financial econometrics, and energy management. Many existing robust regression methods, either grounded in…

Signal Processing · Electrical Eng. & Systems 2025-06-30 Pengyang Song, Jue Wang

Meta-Semi: A Meta-learning Approach for Semi-supervised Learning

Deep learning based semi-supervised learning (SSL) algorithms have led to promising results in recent years. However, they tend to introduce multiple tunable hyper-parameters, making them less practical in real SSL scenarios where the…

Machine Learning · Computer Science 2024-10-30 Yulin Wang, Jiayi Guo, Shiji Song, Gao Huang

Combating Adversarial Misspellings with Robust Word Recognition

To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. Our word recognition models build upon the RNN semi-character architecture, introducing several new backoff…

Computation and Language · Computer Science 2019-08-30 Danish Pruthi, Bhuwan Dhingra, Zachary C. Lipton

SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning

Semi-supervised learning (SSL) methods effectively leverage unlabeled data to improve model generalization. However, SSL models often underperform in open-set scenarios, where unlabeled data contain outliers from novel categories that do…

Computer Vision and Pattern Recognition · Computer Science 2023-11-20 Yue Fan, Anna Kukleva, Dengxin Dai, Bernt Schiele

Breakdown Point of Robust Support Vector Machine

The support vector machine (SVM) is one of the most successful learning methods for solving classification problems. Despite its popularity, SVM has a serious drawback, that is sensitivity to outliers in training samples. The penalty on…

Machine Learning · Statistics 2014-09-04 Takafumi Kanamori, Shuhei Fujiwara, Akiko Takeda

Efficient Algorithms for Outlier-Robust Regression

We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from…

Machine Learning · Computer Science 2020-06-05 Adam Klivans, Pravesh K. Kothari, Raghu Meka

FADER: Fast Adversarial Example Rejection

Deep neural networks are vulnerable to adversarial examples, i.e., carefully-crafted inputs that mislead classification at test time. Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from…

Machine Learning · Computer Science 2020-10-20 Francesco Crecchi, Marco Melis, Angelo Sotgiu, Davide Bacciu, Battista Biggio

Can we achieve robustness from data alone?

We introduce a meta-learning algorithm for adversarially robust classification. The proposed method tries to be as model agnostic as possible and optimizes a dataset prior to its deployment in a machine learning system, aiming to…

Machine Learning · Computer Science 2023-02-01 Nikolaos Tsilivis, Jingtong Su, Julia Kempe

A Robust Regression Approach for Robot Model Learning

Machine learning and data analysis have been used in many robotics fields, especially for modelling. Data are usually the result of sensor measurements and, as such, they might be subjected to noise and outliers. The presence of outliers…

Robotics · Computer Science 2019-08-26 Francesco Cursi, Guang-Zhong Yang

A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML

AutoML platforms have numerous options for the algorithms to try for each step of the analysis, i.e., different possible algorithms for imputation, transformations, feature selection, and modelling. Finding the optimal combination of…

Machine Learning · Computer Science 2024-01-19 Giorgos Borboudakis, Paulos Charonyktakis, Konstantinos Paraschakis, Ioannis Tsamardinos

Robust Regression via Model Based Methods

The mean squared error loss is widely used in many applications, including auto-encoders, multi-target regression, and matrix factorization, to name a few. Despite computational advantages due to its differentiability, it is not robust to…

Machine Learning · Computer Science 2021-07-01 Armin Moharrer, Khashayar Kamran, Edmund Yeh, Stratis Ioannidis

Outlier-robust Mean Estimation near the Breakdown Point via Sum-of-Squares

We revisit the problem of estimating the mean of a high-dimensional distribution in the presence of an $\varepsilon$-fraction of adversarial outliers. When $\varepsilon$ is at most some sufficiently small constant, previous works can…

Data Structures and Algorithms · Computer Science 2024-11-22 Hongjie Chen, Deepak Narayanan Sridharan, David Steurer

Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances

Solving a linear system $Ax=b$ is a fundamental scientific computing primitive for which numerous solvers and preconditioners have been developed. These come with parameters whose optimal values depend on the system being solved and are…

Machine Learning · Computer Science 2024-05-03 Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar