Related papers: Online and Batch Learning Algorithms for Data with…
Most PAC-Bayesian bounds hold in the batch learning setting where data is collected at once, prior to inference or prediction. This somewhat departs from many contemporary learning problems where data streams are collected and the…
In this paper, we study the generalization properties of online learning based stochastic methods for supervised learning problems where the loss function is dependent on more than one training sample (e.g., metric learning, ranking). We…
We develop a novel family of algorithms for the online learning setting with regret against any data sequence bounded by the empirical Rademacher complexity of that sequence. To develop a general theory of when this type of adaptive regret…
We propose new bounds on the error of learning algorithms in terms of a data-dependent notion of complexity. The estimates we establish give optimal rates and are based on a local and empirical version of Rademacher averages, in the sense…
We propose a general framework for studying adaptive regret bounds in the online learning framework, including model selection bounds and data-dependent bounds. Given a data- or model-dependent bound we ask, "Does there exist some algorithm…
We present a new analysis of the problem of learning with drifting distributions in the batch setting using the notion of discrepancy. We prove learning bounds based on the Rademacher complexity of the hypothesis set and the discrepancy of…
In this book, I introduce the basic concepts of Online Learning through the modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I present first-order…
We study the task of online boosting--combining online weak learners into an online strong learner. While batch boosting has a sound theoretical foundation, online boosting deserves more study from the theoretical perspective. In this…
Spurred by the enthusiasm surrounding the "Big Data" paradigm, the mathematical and algorithmic tools of online optimization have found widespread use in problems where the trade-off between data exploration and exploitation plays a…
Online learning methods yield sequential regret bounds under minimal assumptions and provide in-expectation risk bounds for statistical learning. However, despite the apparent advantage of online guarantees over their statistical…
Despite their benefits in terms of simplicity, low computational cost and data requirement, parametric machine learning algorithms, such as linear discriminant analysis, quadratic discriminant analysis or logistic regression, suffer from…
We study online convex optimization under stochastic sub-gradient observation faults, where we introduce adaptive algorithms with minimax optimal regret guarantees. We specifically study scenarios where our sub-gradient observations can be…
This work focuses on the setting of dynamic regret in the context of online learning with full information. In particular, we analyze regret bounds with respect to the temporal variability of the loss functions. By assuming that the…
Modern applications in sensitive domains such as biometrics and medicine frequently require the use of non-decomposable loss functions such as precision@k, F-measure etc. Compared to point loss functions such as hinge-loss, these offer much…
When faced with high frequency streams of data, clustering raises theoretical and algorithmic pitfalls. We introduce a new and adaptive online clustering algorithm relying on a quasi-Bayesian approach, with a dynamic (i.e., time-dependent)…
Batch training of machine learning models based on neural networks is now well established, whereas to date streaming methods are largely based on linear models. To go beyond linear in the online setting, nonparametric methods are of…
We present a framework and analysis of consistent binary classification for complex and non-decomposable performance metrics such as the F-measure and the Jaccard measure. The proposed framework is general, as it applies to both batch and…
In the algorithm selection research, the discussion surrounding algorithm features has been significantly overshadowed by the emphasis on problem features. Although a few empirical studies have yielded evidence regarding the effectiveness…
This paper re-examines the problem of parameter estimation in Bayesian networks with missing values and hidden variables from the perspective of recent work in on-line learning [Kivinen & Warmuth, 1994]. We provide a unified framework for…
We analyze and evaluate an online gradient descent algorithm with adaptive per-coordinate adjustment of learning rates. Our algorithm can be thought of as an online version of batch gradient descent with a diagonal preconditioner. This…