Incremental Seeded EM Algorithm for Clusterwise Linear Regression

Computation 2025-07-08 v1

Abstract

This paper proposes Incremental Seeded Expectation Maximization, an algorithm that improves on the traditional Expectation Maximization computational flow for clusterwise (finite mixture) linear regression. The proposed method performs significantly better, particularly with high-dimensional input, noisy data, or a large number of clusters. Alongside the new algorithm, the paper introduces the concepts of Resolvability and X-predictability, which enable more rigorous discussion of clusterwise regression problems. The resolvability index is quantified using parameters derived from the model, and results demonstrate its strong connection to model quality without requiring knowledge of the ground truth. This makes Resolvability especially useful for assessing the quality of clusterwise regression models and, by extension, the conclusions drawn from them.
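
For orientation, the traditional Expectation Maximization flow that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of textbook EM for a mixture of linear regressions with Gaussian noise, not the Incremental Seeded variant proposed in the paper, whose details are not given here. The function name `em_clusterwise_regression`, its parameters, the random soft initialization, and the small ridge and variance floors are all assumptions made for this sketch.

```python
import numpy as np

def em_clusterwise_regression(X, y, k, n_iter=100, seed=0):
    """Textbook EM for a k-component mixture of linear regressions.

    Hypothetical helper for illustration only; this is the baseline EM
    flow, not the paper's Incremental Seeded EM.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])       # append an intercept column
    resp = rng.dirichlet(np.ones(k), size=n)   # random soft assignments, shape (n, k)
    coefs = np.zeros((k, d + 1))               # per-cluster regression coefficients
    sigma2 = np.ones(k)                        # per-cluster noise variances
    weights = np.full(k, 1.0 / k)              # mixing proportions

    for _ in range(n_iter):
        # M-step: weighted least squares for each cluster.
        for j in range(k):
            w = resp[:, j]
            # Tiny ridge term (assumption) keeps the normal equations solvable.
            A = Xb.T @ (w[:, None] * Xb) + 1e-8 * np.eye(d + 1)
            coefs[j] = np.linalg.solve(A, Xb.T @ (w * y))
            r = y - Xb @ coefs[j]
            # Variance floor (assumption) avoids degenerate zero-variance clusters.
            sigma2[j] = max((w @ r**2) / max(w.sum(), 1e-12), 1e-8)
            weights[j] = w.mean()

        # E-step: posterior probability of each cluster for each sample.
        resid = y[:, None] - Xb @ coefs.T      # residuals, shape (n, k)
        log_p = (np.log(weights + 1e-300)
                 - 0.5 * np.log(2 * np.pi * sigma2)
                 - 0.5 * resid**2 / sigma2)
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exponentiating
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

    return coefs, sigma2, weights, resp
```

A plain EM loop of this kind depends on its initialization and can settle in poor local optima, which is the general setting in which the abstract's remarks about high-dimensional input, noisy data, and many clusters apply.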

Cite

@article{arxiv.2507.04629,
  title  = {Incremental Seeded EM Algorithm for Clusterwise Linear Regression},
  author = {Ye Chow Kuang and Melanie Ooi},
  journal= {arXiv preprint arXiv:2507.04629},
  year   = {2025}
}

Comments

40 pages, 8 figures
