Related papers: Sever: A Robust Meta-Algorithm for Stochastic Opti…
Background: Most of the existing machine learning models for security tasks, such as spam detection, malware detection, or network intrusion detection, are built on supervised machine learning algorithms. In such a paradigm, models need a…
The singular value decomposition (SVD) is a crucial tool in machine learning and statistical data analysis. However, it is highly susceptible to outliers in the data matrix. Existing robust SVD algorithms often sacrifice speed for…
Often the challenge associated with tasks like fraud and spam detection[1] is the lack of all likely patterns needed to train suitable supervised learning models. In order to overcome this limitation, such tasks are attempted as outlier or…
Large Language Models (LLMs) are typically evaluated for safety under single-shot or low-budget adversarial prompting, which underestimates real-world risk. In practice, attackers can exploit large-scale parallel sampling to repeatedly…
Batch Bayesian optimisation and Bayesian quadrature have been shown to be sample-efficient methods of performing optimisation and quadrature where expensive-to-evaluate objective functions can be queried in parallel. However, current…
In many machine learning tasks, a common approach for dealing with large-scale data is to build a small summary, e.g., coreset, that can efficiently represent the original input. However, real-world datasets usually contain outliers…
In real-world applications, it is important for machine learning algorithms to be robust against data outliers or corruptions. In this paper, we focus on improving the robustness of a large class of learning algorithms that are formulated…
Robust regression models in the presence of outliers have significant practical relevance in areas such as signal processing, financial econometrics, and energy management. Many existing robust regression methods, either grounded in…
Deep learning based semi-supervised learning (SSL) algorithms have led to promising results in recent years. However, they tend to introduce multiple tunable hyper-parameters, making them less practical in real SSL scenarios where the…
To combat adversarial spelling mistakes, we propose placing a word recognition model in front of the downstream classifier. Our word recognition models build upon the RNN semi-character architecture, introducing several new backoff…
Semi-supervised learning (SSL) methods effectively leverage unlabeled data to improve model generalization. However, SSL models often underperform in open-set scenarios, where unlabeled data contain outliers from novel categories that do…
The support vector machine (SVM) is one of the most successful learning methods for solving classification problems. Despite its popularity, SVM has a serious drawback: sensitivity to outliers in training samples. The penalty on…
We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from…
Deep neural networks are vulnerable to adversarial examples, i.e., carefully-crafted inputs that mislead classification at test time. Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from…
We introduce a meta-learning algorithm for adversarially robust classification. The proposed method tries to be as model agnostic as possible and optimizes a dataset prior to its deployment in a machine learning system, aiming to…
Machine learning and data analysis have been used in many robotics fields, especially for modelling. Data are usually the result of sensor measurements and, as such, they might be subject to noise and outliers. The presence of outliers…
AutoML platforms have numerous options for the algorithms to try for each step of the analysis, i.e., different possible algorithms for imputation, transformations, feature selection, and modelling. Finding the optimal combination of…
The mean squared error loss is widely used in many applications, including auto-encoders, multi-target regression, and matrix factorization, to name a few. Despite computational advantages due to its differentiability, it is not robust to…
We revisit the problem of estimating the mean of a high-dimensional distribution in the presence of an $\varepsilon$-fraction of adversarial outliers. When $\varepsilon$ is at most some sufficiently small constant, previous works can…
Solving a linear system $Ax=b$ is a fundamental scientific computing primitive for which numerous solvers and preconditioners have been developed. These come with parameters whose optimal values depend on the system being solved and are…