Fast Learning from Sparse Data

Machine Learning, 2015-05-19, v2

Abstract

We describe two techniques that significantly improve the running time of several standard machine-learning algorithms when data is sparse. The first technique is an algorithm that efficiently extracts one-way and two-way counts, either real or expected, from discrete data. Extracting such counts is a fundamental step in learning algorithms for constructing a variety of models, including decision trees, decision graphs, Bayesian networks, and naive-Bayes clustering models. The second technique is an algorithm that efficiently performs the E-step of the EM algorithm (i.e., inference) when applied to a naive-Bayes clustering model. Using real-world data sets, we demonstrate a dramatic decrease in running time for algorithms that incorporate these techniques.
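To make the counting step concrete, the following is a minimal sketch of how one-way and two-way counts might be extracted from sparse discrete data. It is not taken from the paper. It assumes a representation in which each case stores only the variables that differ from a designated default state, so the counts for the default state can be recovered by subtraction instead of by visiting every cell. The function names, the dictionary-based row format, and the choice of `0` as the default state are all illustrative assumptions.

```python
from collections import Counter


def sparse_one_way_counts(rows, n_vars, default=0):
    """Count how often each variable takes each state.

    rows: list of dicts mapping variable index -> non-default state
          (hypothetical sparse format; absent variables are in `default`).
    """
    n_rows = len(rows)
    nondefault = [Counter() for _ in range(n_vars)]
    # Touch only the explicitly stored (non-default) entries.
    for row in rows:
        for var, state in row.items():
            nondefault[var][state] += 1
    counts = []
    for var in range(n_vars):
        c = dict(nondefault[var])
        # The default-state count is whatever was not seen explicitly.
        c[default] = n_rows - sum(nondefault[var].values())
        counts.append(c)
    return counts


def sparse_two_way_counts(rows, targets, n_vars, default=0):
    """Count co-occurrences of (variable state, target state)."""
    target_totals = Counter(targets)
    nondefault = [Counter() for _ in range(n_vars)]
    for row, y in zip(rows, targets):
        for var, state in row.items():
            nondefault[var][(state, y)] += 1
    counts = []
    for var in range(n_vars):
        c = dict(nondefault[var])
        seen_per_target = Counter()
        for (_, y), n in nondefault[var].items():
            seen_per_target[y] += n
        # Derive default-state counts per target value by subtraction.
        for y, total in target_totals.items():
            c[(default, y)] = total - seen_per_target[y]
        counts.append(c)
    return counts
```

Under these assumptions the work in the main loops is proportional to the number of stored non-default entries, plus one pass over the variables, which is where a saving on sparse data would come from.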

Cite

@article{arxiv.1301.6685,
  title  = {Fast Learning from Sparse Data},
  author = {David Maxwell Chickering and David Heckerman},
  journal= {arXiv preprint arXiv:1301.6685},
  year   = {2015}
}

Comments

Appears in Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI 1999)
