Related papers: An information-theoretic learning model based on i…

200 papers

A current assumption of most clustering methods is that the training data and future data are taken from the same distribution. However, this assumption may not hold in most real-world scenarios. In this paper, we propose an information…

Machine Learning · Statistics 2023-05-31 Jiangshe Zhang , Lizhen Ji , Meng Wang

Given a task of predicting $Y$ from $X$, a loss function $L$, and a set of probability distributions $\Gamma$ on $(X,Y)$, what is the optimal decision rule minimizing the worst-case expected loss over $\Gamma$? In this paper, we address…

Machine Learning · Statistics 2017-07-05 Farzan Farnia , David Tse

Empirical risk minimization stands behind most optimization in supervised machine learning. Under this scheme, labeled data is used to approximate an expected cost (risk), and a learning algorithm updates model-defining parameters in search…

Machine Learning · Statistics 2023-05-25 James Schmidt

The empirical risk minimization approach to data-driven decision making requires access to training data drawn under the same conditions as those that will be faced when the decision rule is deployed. However, in a number of settings, we…

Methodology · Statistics 2025-09-17 Roshni Sahoo , Lihua Lei , Stefan Wager

Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly. However, recent successes in few-shot learning and related problems are encouraging signs that these models…

Machine Learning · Statistics 2020-10-15 James Lucas , Mengye Ren , Irene Kameni , Toniann Pitassi , Richard Zemel

We consider statistical learning problems, when the distribution $P'$ of the training observations $Z'_1,\; \ldots,\; Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution)…

Machine Learning · Statistics 2020-02-20 Robin Vogel , Mastane Achab , Stéphan Clémençon , Charles Tillier

This paper investigates, from information theoretic grounds, a learning problem based on the principle that any regularity in a given dataset can be exploited to extract compact features from data, i.e., using fewer bits than needed to…

Machine Learning · Statistics 2018-11-14 Matías Vera , Leonardo Rey Vega , Pablo Piantanida

How can we effectively remove or "unlearn" undesirable information, such as specific features or the influence of individual data points, from a learning outcome while minimizing utility loss and ensuring rigorous guarantees? We introduce…

Machine Learning · Computer Science 2025-12-30 Shizhou Xu , Thomas Strohmer

Exponential models of distributions are widely used in machine learning for classification and modelling. It is well known that they can be interpreted as maximum entropy models under empirical expectation constraints. In this work, we…

Machine Learning · Computer Science 2012-07-19 Amir Globerson , Naftali Tishby

Feature selection is one of the most fundamental problems in machine learning. An extensive body of work on information-theoretic feature selection exists which is based on maximizing mutual information between subsets of features and class…

Machine Learning · Statistics 2016-06-10 Shuyang Gao , Greg Ver Steeg , Aram Galstyan

The goal of regression and classification methods in supervised learning is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce…

Optimization and Control · Mathematics 2019-07-15 Soroosh Shafieezadeh-Abadeh , Daniel Kuhn , Peyman Mohajerin Esfahani

The concept of a minimax classifier is well-established in statistical decision theory, but its implementation via neural networks remains challenging, particularly in scenarios with imbalanced training data having a limited number of…

Machine Learning · Computer Science 2026-01-07 Hansung Choi , Daewon Seo

Driven by applications in telecommunication networks, we explore the simulation task of estimating rare event probabilities for tandem queues in their steady state. Existing literature has recognized that importance sampling methods can be…

Machine Learning · Computer Science 2025-04-22 Ruoning Zhao , Xinyun Chen

Many machine learning models appear to deploy effortlessly under distribution shift, and perform well on a target distribution that is considerably different from the training distribution. Yet, learning theory of distribution shift bounds…

Machine Learning · Computer Science 2024-05-30 Robi Bhattacharjee , Nick Rittler , Kamalika Chaudhuri

Importance sampling approximates expectations with respect to a target measure by using samples from a proposal measure. The performance of the method over large classes of test functions depends heavily on the closeness between both…

Computation · Statistics 2016-09-01 Daniel Sanz-Alonso

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top…

Machine Learning · Statistics 2020-03-31 Martin Arjovsky , Léon Bottou , Ishaan Gulrajani , David Lopez-Paz

A central challenge to applying many off-policy reinforcement learning algorithms to real world problems is the variance introduced by importance sampling. In off-policy learning, the agent learns about a different policy than the one being…

Machine Learning · Computer Science 2022-06-20 Eric Graves , Sina Ghiassian

We study problem-dependent rates, i.e., generalization errors that scale near-optimally with the variance, the effective loss, or the gradient norms evaluated at the "best hypothesis." We introduce a principled framework dubbed "uniform…

Machine Learning · Statistics 2020-12-25 Yunbei Xu , Assaf Zeevi

We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set $\mathcal{G}$ up to the smallest possible additive term, called the convergence rate. When the…

Statistics Theory · Mathematics 2009-09-09 Jean-Yves Audibert

In this paper, we study a simple and generic framework to tackle the problem of learning model parameters when a fraction of the training samples are corrupted. We first make a simple observation: in a variety of such settings, the…

Machine Learning · Computer Science 2019-02-20 Yanyao Shen , Sujay Sanghavi