A Competitive Algorithm for Agnostic Active Learning

Eric Price; Yihan Zhou

Machine Learning; 2024-05-24 v3; Data Structures and Algorithms

Authors: Eric Price, Yihan Zhou

Abstract

For some hypothesis classes and input distributions, active agnostic learning needs exponentially fewer samples than passive learning; for other classes and distributions, it offers little to no improvement. The most popular algorithms for agnostic active learning express their performance in terms of a parameter called the disagreement coefficient, but it is known that these algorithms are inefficient on some inputs. We take a different approach to agnostic active learning, getting an algorithm that is competitive with the optimal algorithm for any binary hypothesis class $H$ and distribution $D_X$ over $X$. In particular, if any algorithm can use $m^*$ queries to get $O(\eta)$ error, then our algorithm uses $O(m^* \log |H|)$ queries to get $O(\eta)$ error. Our algorithm lies in the vein of the splitting-based approach of Dasgupta [2004], which gets a similar result for the realizable ($\eta = 0$) setting. We also show that it is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.

Keywords

active learning, computational learning theory, machine learning theory

Cite

@article{arxiv.2310.18786,
  title  = {A Competitive Algorithm for Agnostic Active Learning},
  author = {Eric Price and Yihan Zhou},
  journal= {arXiv preprint arXiv:2310.18786},
  year   = {2024}
}
