Related papers: Sharper Risk Bound for Multi-Task Learning with Mu…

Local Rademacher Complexity-based Learning Guarantees for Multi-Task Learning

We show a Talagrand-type concentration inequality for Multi-Task Learning (MTL), using which we establish sharp excess risk bounds for MTL in terms of distribution- and data-dependent versions of the Local Rademacher Complexity (LRC). We…

Machine Learning · Computer Science 2017-02-13 Niloofar Yousefi, Yunwen Lei, Marius Kloft, Mansooreh Mollaghasemi, Georgios Anagnostopoulos

A One-Inclusion Graph Approach to Multi-Group Learning

We prove the tightest-known upper bounds on the sample complexity of multi-group learning. Our algorithm extends the one-inclusion graph prediction strategy using a generalization of bipartite $b$-matching. In the group-realizable setting,…

Machine Learning · Computer Science 2026-04-10 Noah Bergam, Samuel Deng, Daniel Hsu

Robust Multi-Task Learning with Excess Risks

Multi-task learning (MTL) considers learning a joint model for multiple tasks by optimizing a convex combination of all task losses. To solve the optimization problem, existing methods use an adaptive weight updating scheme, where task…

Machine Learning · Computer Science 2024-07-22 Yifei He, Shiji Zhou, Guojun Zhang, Hyokun Yun, Yi Xu, Belinda Zeng, Trishul Chilimbi, Han Zhao

Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods

In this paper, we study the data-dependent convergence and generalization behavior of gradient methods for neural networks with smooth activation. Our first result is a novel bound on the excess risk of deep networks trained by the logistic…

Machine Learning · Computer Science 2024-12-09 Hossein Taheri, Christos Thrampoulidis, Arya Mazumdar

Analytical Bounds on Maximum-Likelihood Decoded Linear Codes with Applications to Turbo-Like Codes: An Overview

Upper and lower bounds on the error probability of linear codes under maximum-likelihood (ML) decoding are shortly surveyed and applied to ensembles of codes on graphs. For upper bounds, focus is put on Gallager bounding techniques and…

Information Theory · Computer Science 2007-07-13 Igal Sason, Shlomo Shamai

Excess risk bounds for multitask learning with trace norm regularization

Trace norm regularization is a popular method of multitask learning. We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution. The bounds are…

Machine Learning · Statistics 2013-01-15 Andreas Maurer, Massimiliano Pontil

Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$

The sharpest known high probability generalization bounds for uniformly stable algorithms (Feldman, Vondrák, 2018, 2019), (Bousquet, Klochkov, Zhivotovskiy, 2020) contain a generally inevitable sampling error term of order…

Machine Learning · Computer Science 2021-11-19 Yegor Klochkov, Nikita Zhivotovskiy

Optimistic Rates for Learning with a Smooth Loss

We establish an excess risk bound of $O(H R_n^2 + R_n \sqrt{H L^*})$ for empirical risk minimization with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis…

Machine Learning · Computer Science 2012-11-27 Nathan Srebro, Karthik Sridharan, Ambuj Tewari

A strong converse bound for multiple hypothesis testing, with applications to high-dimensional estimation

In statistical inference problems, we wish to obtain lower bounds on the minimax risk, that is to bound the performance of any possible estimator. A standard technique to obtain risk lower bounds involves the use of Fano's inequality. In an…

Information Theory · Computer Science 2018-04-06 Ramji Venkataramanan, Oliver Johnson

Stability and Sharper Risk Bounds with Convergence Rate $\tilde{O}(1/n^2)$

Prior work (Klochkov & Zhivotovskiy, 2021) establishes at most $O\left(\log (n)/n\right)$ excess risk bounds via algorithmic stability for strongly-convex learners with high probability. We show that under the similar common assumptions…

Machine Learning · Computer Science 2025-10-31 Bowei Zhu, Shaojie Li, Mingyang Yi, Yong Liu

Improved optimization strategies for deep Multi-Task Networks

In Multi-Task Learning (MTL), it is a common practice to train multi-task networks by optimizing an objective function, which is a weighted average of the task-specific objective functions. Although the computational advantages of this…

Machine Learning · Computer Science 2022-07-19 Lucas Pascal, Pietro Michiardi, Xavier Bost, Benoit Huet, Maria A. Zuluaga

An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning

Meta-learning aims at optimizing the hyperparameters of a model class or training algorithm from the observation of data from a number of related tasks. Following the setting of Baxter [1], the tasks are assumed to belong to the same task…

Machine Learning · Computer Science 2021-05-11 Sharu Theresa Jose, Osvaldo Simeone

Sample Complexity Bounds for Recurrent Neural Networks with Application to Combinatorial Graph Problems

Learning to predict solutions to real-valued combinatorial graph problems promises efficient approximations. As demonstrated based on the NP-hard edge clique cover number, recurrent neural networks (RNNs) are particularly suited for this…

Machine Learning · Statistics 2019-11-20 Nil-Jana Akpinar, Bernhard Kratzwald, Stefan Feuerriegel

Degree Heterogeneity in Higher-Order Networks: Inference in the Hypergraph $\boldsymbol{\beta}$-Model

The $\boldsymbol{\beta}$-model for random graphs is commonly used for representing pairwise interactions in a network with degree heterogeneity. Going beyond pairwise interactions, Stasi et al. (2014) introduced the hypergraph…

Statistics Theory · Mathematics 2024-06-07 Sagnik Nandy, Bhaswar B. Bhattacharya

Lower Bounds on Active Learning for Graphical Model Selection

We consider the problem of estimating the underlying graph associated with a Markov random field, with the added twist that the decoding algorithm can iteratively choose which subsets of nodes to sample based on the previous samples,…

Information Theory · Computer Science 2017-02-08 Jonathan Scarlett, Volkan Cevher

Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles

In this paper, our aim is to analyse the generalization capabilities of first-order methods for statistical learning in multiple, different yet related, scenarios including supervised learning, transfer learning, robust learning and…

Machine Learning · Computer Science 2024-07-02 Kevin Scaman, Mathieu Even, Batiste Le Bars, Laurent Massoulié

Towards Optimal Problem Dependent Generalization Error Bounds in Statistical Learning Theory

We study problem-dependent rates, i.e., generalization errors that scale near-optimally with the variance, the effective loss, or the gradient norms evaluated at the "best hypothesis." We introduce a principled framework dubbed "uniform…

Machine Learning · Statistics 2020-12-25 Yunbei Xu, Assaf Zeevi

Theoretical Investigations and Practical Enhancements on Tail Task Risk Minimization in Meta Learning

Meta learning is a promising paradigm in the era of large models and task distributional robustness has become an indispensable consideration in real-world scenarios. Recent advances have examined the effectiveness of tail task risk…

Machine Learning · Computer Science 2024-10-31 Yiqin Lv, Qi Wang, Dong Liang, Zheng Xie

Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints

Offline reinforcement learning (RL) aims to find an optimal policy for Markov decision processes (MDPs) using a pre-collected dataset. In this work, we revisit the linear programming (LP) reformulation of Markov decision processes for…

Machine Learning · Computer Science 2024-12-11 Asuman Ozdaglar, Sarath Pattathil, Jiawei Zhang, Kaiqing Zhang

Fast learning rates in statistical inference through aggregation

We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set $\mathcal{G}$ up to the smallest possible additive term, called the convergence rate. When the…

Statistics Theory · Mathematics 2009-09-09 Jean-Yves Audibert