Related papers: Stabilizing Linear Prediction Models using Autoenc…
We study the problem of linear feature selection when features are highly correlated. Such settings pose two fundamental challenges. First, how should model similarity be defined? Simply counting features in common can be misleading: two…
The reliability assessment of a machine learning model's prediction is an important quantity for the deployment in safety critical applications. Not only can it be used to detect novel sceneries, either as out-of-distribution or anomaly…
Many modern datasets don't fit neatly into $n \times p$ matrices, but most techniques for measuring statistical stability expect rectangular data. We study methods for stability assessment on non-rectangular data, using statistical learning…
Stability in clinical prediction models is crucial for transferability between studies, yet has received little attention. The problem is paramount in high dimensional data which invites sparse models with feature selection capability. We…
Fitting models with high predictive accuracy that include all relevant but no irrelevant or redundant features is a challenging task on data sets with similar (e.g. highly correlated) features. We propose the approach of tuning the…
In this paper, we study the "stability" of machine learning (ML) models within the context of larger, complex NLP systems with continuous training data updates. For this study, we propose a methodology for the assessment of model stability…
Updating machine learning models with new information usually improves their predictive performance, yet, in many applications, it is also desirable to avoid changing the model predictions too much. This property is called stability. In…
We consider regression in which one predicts a response $Y$ with a set of predictors $X$ across different experiments or environments. This is a common setup in many data-driven scientific fields and we argue that statistical inference can…
A new Bayesian approach to linear system identification has been proposed in a series of recent papers. The main idea is to frame linear system identification as predictor estimation in an infinite dimensional space, with the aid of…
We consider the problem of learning linear prediction models with model misspecification bias. In such case, the collinearity among input variables may inflate the error of parameter estimation, resulting in instability of prediction…
Preference learning in large language models relies on reward models as proxies for human judgment. However, these models frequently exhibit preference instability, producing contradictory preference assignments in response to subtle,…
We consider the problem of stabilization of a linear system, under state and control constraints, and subject to bounded disturbances and unknown parameters in the state matrix. First, using a simple least square solution and available…
Wide accessibility of imaging and profile sensors in modern industrial systems created an abundance of high-dimensional sensing variables. This led to a a growing interest in the research of high-dimensional process monitoring. However,…
In health related machine learning applications, the training data often corresponds to a non-representative sample from the target populations where the learners will be deployed. In anticausal prediction tasks, selection biases often make…
Stability selection is a widely adopted resampling-based framework for high-dimensional variable selection. This paper seeks to broaden the use of an established stability estimator to evaluate the overall stability of the stability…
The solution of linear inverse problems arising, for example, in signal and image processing is a challenging problem since the ill-conditioning amplifies, in the solution, the noise present in the data. Recently introduced algorithms based…
Penalized regression models are popularly used in high-dimensional data analysis to conduct variable selection and model fitting simultaneously. Whereas success has been widely reported in literature, their performances largely depend on…
A convolutional autoencoder is trained using a database of airfoil aerodynamic simulations and assessed in terms of overall accuracy and interpretability. The goal is to predict the stall and to investigate the ability of the autoencoder to…
The impact of machine learning models on healthcare will depend on the degree of trust that healthcare professionals place in the predictions made by these models. In this paper, we present a method to provide people with clinical expertise…
Despite decades of research, SE lacks widely accepted models (that offer precise quantitative stable predictions) about what factors most influence software quality. This paper provides a promising result showing such stable models can be…