Related papers: On Statistical Efficiency in Learning

Regression Model Selection Under General Conditions

Model selection criteria are one of the most important tools in statistics. Proofs showing a model selection criterion is asymptotically optimal are tailored to the type of model (linear regression, quantile regression, penalized…

Statistics Theory · Mathematics 2025-10-17 Amaze Lusompa

Optimal Learning for Stochastic Optimization with Nonlinear Parametric Belief Models

We consider the problem of estimating the expected value of information (the knowledge gradient) for Bayesian learning problems where the belief model is nonlinear in the parameters. Our goal is to maximize some metric, while simultaneously…

Machine Learning · Statistics 2016-11-23 Xinyu He, Warren B. Powell

Efficient Online Learning for Optimizing Value of Information: Theory and Application to Interactive Troubleshooting

We consider the optimal value of information (VoI) problem, where the goal is to sequentially select a set of tests with a minimal cost, so that one can efficiently make the best decision based on the observed outcomes. Existing algorithms…

Artificial Intelligence · Computer Science 2017-07-18 Yuxin Chen, Jean-Michel Renders, Morteza Haghir Chehreghani, Andreas Krause

The Pivotal Information Criterion

The Bayesian and Akaike information criteria aim at finding a good balance between under- and over-fitting. They are extensively used every day by practitioners. Yet we contend they suffer from at least two afflictions: their penalty…

Statistics Theory · Mathematics 2026-03-20 Sylvain Sardy, Maxime van Cutsem, Sara van de Geer

Information-based inference for singular models and finite sample sizes: A frequentist information criterion

In the information-based paradigm of inference, model selection is performed by selecting the candidate model with the best estimated predictive performance. The success of this approach depends on the accuracy of the estimate of the…

Machine Learning · Statistics 2018-06-11 Colin H. LaMont, Paul A. Wiggins

Batch mode active learning for efficient parameter estimation

For many tasks of data analysis, we may only have the information of the explanatory variable and the evaluation of the response values is quite expensive. While it is impractical or too costly to obtain the responses of all units, a…

Computation · Statistics 2023-04-07 Wei Zheng, Ting Tian, Xueqin Wang

Asymptotic optimality of a cross-validatory predictive approach to linear model selection

In this article we study the asymptotic predictive optimality of a model selection criterion based on the cross-validatory predictive density, already available in the literature. For a dependent variable and associated explanatory…

Statistics Theory · Mathematics 2008-12-18 Arijit Chakrabarti, Tapas Samanta

Towards Accelerated Model Training via Bayesian Data Selection

Mislabeled, duplicated, or biased data in real-world scenarios can lead to prolonged training and even hinder model convergence. Traditional solutions prioritizing easy or hard samples lack the flexibility to handle such a variety…

Machine Learning · Computer Science 2023-11-08 Zhijie Deng, Peng Cui, Jun Zhu

A Robust Consistent Information Criterion for Model Selection based on Empirical Likelihood

Conventional likelihood-based information criteria for model selection rely on the distribution assumption of data. However, for complex data that are increasingly available in many scientific fields, the specification of their underlying…

Methodology · Statistics 2020-06-25 Chixiang Chen, Ming Wang, Rongling Wu, Runze Li

Information bottleneck theory of high-dimensional regression: relevancy, efficiency and optimality

Avoiding overfitting is a central challenge in machine learning, yet many large neural networks readily achieve zero training loss. This puzzling contradiction necessitates new approaches to the study of overfitting. Here we quantify…

Information Theory · Computer Science 2022-10-13 Vudtiwat Ngampruetikorn, David J. Schwab

Annealing Optimization for Progressive Learning with Stochastic Approximation

In this work, we introduce a learning model designed to meet the needs of applications in which computational resources are limited, and robustness and interpretability are prioritized. Learning problems can be formulated as constrained…

Systems and Control · Electrical Eng. & Systems 2025-09-26 Christos Mavridis, John Baras

Decision Making with Side Information and Unbounded Loss Functions

We consider the problem of decision-making with side information and unbounded loss functions. Inspired by the probably approximately correct learning model, we use a slightly different model that incorporates the notion of side information in…

Machine Learning · Computer Science 2007-07-13 Majid Fozunbal, Ton Kalker

Equations of States in Statistical Learning for a Nonparametrizable and Regular Case

Many learning machines that have hierarchical structure or hidden variables are now being used in information science, artificial intelligence, and bioinformatics. However, several learning machines used in such fields are not regular but…

Machine Learning · Computer Science 2015-05-13 Sumio Watanabe

The Statistical Complexity of Interactive Decision Making

A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret. This question is…

Machine Learning · Computer Science 2023-07-12 Dylan J. Foster, Sham M. Kakade, Jian Qian, Alexander Rakhlin

Ask for More Than Bayes Optimal: A Theory of Indecisions for Classification

Selective classification is a powerful tool for automated decision-making in high-risk scenarios, allowing classifiers to act only when confident and abstain when uncertainty is high. Given a target accuracy, our goal is to minimize…

Statistics Theory · Mathematics 2025-10-28 Mohamed Ndaoud, Peter Radchenko, Bradley Rava

Model-specific Data Subsampling with Influence Functions

Model selection requires repeatedly evaluating models on a given dataset and measuring their relative performances. In modern applications of machine learning, the models being considered are increasingly more expensive to evaluate and the…

Machine Learning · Computer Science 2020-10-21 Anant Raj, Cameron Musco, Lester Mackey, Nicolo Fusi

Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process

In regression with random design, we study the problem of selecting a model that performs well for out-of-sample prediction. We do not assume that any of the candidate models under consideration are correct. Our analysis is based on…

Methodology · Statistics 2008-10-24 Hannes Leeb

Optimal Sub-sampling with Influence Functions

Sub-sampling is a common and often effective method to deal with the computational challenges of large datasets. However, for most statistical models, there is no well-motivated approach for drawing a non-uniform subsample. We show that the…

Machine Learning · Statistics 2017-09-07 Daniel Ting, Eric Brochu

Aligning Learning and Endogenous Decision-Making

Many of the observations we make are biased by our decisions. For instance, the demand of items is impacted by the prices set, and online checkout choices are influenced by the assortments presented. The challenge in decision-making under…

Machine Learning · Computer Science 2025-07-02 Rares Cristian, Pavithra Harsha, Georgia Perakis, Brian Quanz

Learning to Optimize via Information-Directed Sampling

We propose information-directed sampling -- a new approach to online optimization problems in which a decision-maker must balance between exploration and exploitation while learning from partial feedback. Each action is sampled in a manner…

Machine Learning · Computer Science 2017-07-10 Daniel Russo, Benjamin Van Roy