Related papers: Competitive Distribution Estimation
We study the problem of oracle-efficient hybrid online learning when the features are generated by an unknown i.i.d. process and the labels are generated adversarially. Assuming access to an (offline) ERM oracle, we show that there exists a…
We study the fundamental problem of estimating an unknown discrete distribution $p$ over $d$ symbols, given $n$ i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the…
We consider the online distributed non-stochastic experts problem, where the distributed system consists of one coordinator node that is connected to $k$ sites, and the sites are required to communicate with each other via the coordinator.…
We are interested in testing properties of distributions with systematically mislabeled samples. Our goal is to make decisions about unknown probability distributions, using a sample that has been collected by a confused collector, such as…
We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point…
We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving,…
In reinforcement learning, specifying reward functions that capture the intended task can be very challenging. Reward learning aims to address this issue by learning the reward function. However, a learned reward model may have a low error…
Most microeconomic models of interest involve optimizing a piecewise linear function. These include contract design in hidden-action principal-agent problems, selling an item in posted-price auctions, and bidding in first-price auctions.…
We consider the optimal control of large-scale systems using distributed controllers with a network topology that mirrors the coupling graph between subsystems. In this work, we introduce spatial regret, a graph-informed metric that…
A fundamental principle of learning theory is that there is a trade-off between the complexity of a prediction rule and its ability to generalize. Modern machine learning models do not obey this paradigm: They produce an accurate prediction…
The problem of estimating the Kullback-Leibler divergence $D(P\|Q)$ between two unknown distributions $P$ and $Q$ is studied, under the assumption that the alphabet size $k$ of the distributions can scale to infinity. The estimation is…
In this paper we study the mincut problem in the online setting. We consider two distinct models: A) competitive analysis and B) regret analysis. In the competitive setting we consider the vertex arrival model; whenever a new vertex arrives…
We consider distributed statistical optimization in one-shot setting, where there are $m$ machines each observing $n$ i.i.d. samples. Based on its observed samples, each machine then sends an $O(\log(mn))$-length message to a server, at…
We consider the weakly supervised binary classification problem where the labels are randomly flipped with probability $1- {\alpha}$. Although there exist numerous algorithms for this problem, it remains theoretically unexplored how the…
In this paper, we consider learning scenarios where the learned model is evaluated under an unknown test distribution which potentially differs from the training distribution (i.e. distribution shift). The learner has access to a family of…
Auctions with partially-revealed information about items are broadly employed in real-world applications, but the underlying mechanisms have limited theoretical support. In this work, we study a machine learning formulation of these types…
Consider the classical problem of predicting the next bit in a sequence of bits. A standard performance measure is {\em regret} (loss in payoff) with respect to a set of experts. For example if we measure performance with respect to two…
This paper studies an online learning problem that seeks optimal testing policies for a stream of subjects, each of whom can be evaluated through a sequence of candidate tests drawn from a common pool. We refer to this problem as the Online…
We resolve an open question from (Christiano, 2014b) posed in COLT'14 regarding the optimal dependency of the regret achievable for online local learning on the size of the label set. In this framework the algorithm is shown a pair of items…
We consider the problem of online learning in Linear Quadratic Control systems whose state transition and state-action transition matrices $A$ and $B$ may be initially unknown. We devise an online learning algorithm and provide guarantees…