Limits To (Machine) Learning

Machine Learning, 2025-12-16, v1

Abstract

Machine learning (ML) methods are highly flexible, but their ability to approximate the true data-generating process is fundamentally constrained by finite samples. We characterize a universal lower bound, the Limits-to-Learning Gap (LLG), quantifying the unavoidable discrepancy between a model's empirical fit and the population benchmark. Recovering the true population $R^2$ therefore requires correcting observed predictive performance by this bound. Using a broad set of variables, including excess returns, yields, credit spreads, and valuation ratios, we find that the implied LLGs are large. This indicates that standard ML approaches can substantially understate true predictability in financial data. We also derive LLG-based refinements to the classic Hansen and Jagannathan (1991) bounds, analyze implications for parameter learning in general-equilibrium settings, and show that the LLG provides a natural mechanism for generating excess volatility.

Cite

@article{arxiv.2512.12735,
  title  = {Limits To (Machine) Learning},
  author = {Zhimin Chen and Bryan Kelly and Semyon Malamud},
  journal= {arXiv preprint arXiv:2512.12735},
  year   = {2025}
}