Predictive PAC learnability: a paradigm for learning from exchangeable input data

Machine Learning 2016-11-18 v2

Abstract

Exchangeable random variables form an important and well-studied generalization of i.i.d. variables; however, simple examples show that no nontrivial concept or function classes are PAC learnable under general exchangeable data inputs $X_1,X_2,\ldots$. Inspired by the work of Berti and Rigo on a Glivenko--Cantelli theorem for exchangeable inputs, we propose a new paradigm, adequate for learning from exchangeable data: predictive PAC learnability. A learning rule $\mathcal L$ for a function class $\mathscr F$ is predictive PAC if for every $\varepsilon,\delta>0$ and each function $f\in\mathscr F$, whenever $\lvert\sigma\rvert\geq s(\delta,\varepsilon)$, we have with confidence $1-\delta$ that the expected difference between $f(X_{n+1})$ and the image of $f\vert\sigma$ under $\mathcal L$ does not exceed $\varepsilon$ conditionally on $X_1,X_2,\ldots,X_n$. Thus, instead of learning the function $f$ as such, we are learning to a given accuracy $\varepsilon$ the predictive behaviour of $f$ at the future points $X_i(\omega)$, $i>n$, of the sample path. Using de Finetti's theorem, we show that if a universally separable function class $\mathscr F$ is distribution-free PAC learnable under i.i.d. inputs, then it is distribution-free predictive PAC learnable under exchangeable inputs, with a slightly worse sample complexity.
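
The contrast between ordinary and predictive PAC learning can be illustrated with a small simulation. The sketch below is not taken from the paper: the two-component mixture, the target threshold 0.75, the learner, and all function names are hypothetical choices made for illustration. Following de Finetti's theorem, an exchangeable path is generated by drawing a hidden parameter theta once and then sampling i.i.d. from P_theta (uniform on [0, 0.5) for theta = 0, uniform on [0.5, 1) for theta = 1). Because the two supports are disjoint, conditioning on the observed sample is here the same as conditioning on theta. The learned hypothesis is then scored twice: on future points of the same path, and on fresh draws from the marginal law of a single observation.

```python
import random

TARGET_THRESHOLD = 0.75  # hypothetical target concept: f(x) = 1 if x >= 0.75, else 0


def target(x):
    """Label a point with the (assumed) target threshold concept."""
    return 1 if x >= TARGET_THRESHOLD else 0


def draw_points(theta, m, rng):
    """m i.i.d. points from P_theta: uniform on [0, 0.5) if theta == 0, else on [0.5, 1)."""
    return [0.5 * theta + 0.5 * rng.random() for _ in range(m)]


def draw_marginal_points(m, rng):
    """m points from the marginal law of one observation: theta is redrawn for every point."""
    return [draw_points(rng.randrange(2), 1, rng)[0] for _ in range(m)]


def learn_threshold(xs):
    """Consistent learner: smallest positively labelled sample point, or 1.0 if there is none."""
    positives = [x for x in xs if target(x) == 1]
    return min(positives) if positives else 1.0


def error_rate(t_hat, points):
    """Fraction of points where the hypothesis 1[x >= t_hat] disagrees with the target."""
    wrong = sum(target(x) != (1 if x >= t_hat else 0) for x in points)
    return wrong / len(points)


def run(theta, n=200, m=20000, seed=0):
    """Learn from n points of one path, then estimate predictive and marginal error."""
    rng = random.Random(seed)
    sample = draw_points(theta, n, rng)  # X_1, ..., X_n on a single sample path
    t_hat = learn_threshold(sample)
    predictive = error_rate(t_hat, draw_points(theta, m, rng))  # future points, same path
    marginal = error_rate(t_hat, draw_marginal_points(m, rng))  # fresh draws from the mixture
    return t_hat, predictive, marginal


for theta in (0, 1):
    t_hat, predictive, marginal = run(theta)
    print(f"theta={theta}: t_hat={t_hat:.4f} predictive={predictive:.4f} marginal={marginal:.4f}")
```

On a path with theta = 0 the sample never contains a positive example, so under these assumptions the error measured against the marginal law stays near 0.25 no matter how large n is, while the error on future points of the same path is exactly zero. This toy only mirrors the distinction drawn in the abstract; it does not reproduce the paper's Definition 4.1 or its sample complexity bound.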

Cite

@article{arxiv.1006.1129,
  title  = {Predictive PAC learnability: a paradigm for learning from exchangeable input data},
  author = {Vladimir Pestov},
  journal= {arXiv preprint arXiv:1006.1129},
  year   = {2016}
}

Comments

5 pages, LaTeX; a postprint correcting a typo in the main definition, Definition 4.1