Finite mixture regression: A sparse variable selection by model selection for clustering
Abstract
We consider a finite mixture of Gaussian regressions model for high-dimensional data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by a maximum likelihood estimator restricted to the relevant variables, which are selected by an ℓ1-penalized maximum likelihood estimator. We obtain an oracle inequality satisfied by this estimator for a Jensen-Kullback-Leibler type loss. The oracle inequality is deduced from a general model selection theorem for maximum likelihood estimators with a random model collection. From it we derive the shape of the penalty in the criterion, which depends on the complexity of the random model collection.
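The procedure described in the abstract can be sketched in formulas. The notation below is an assumption made for illustration, not taken from the paper: a univariate response, K mixture components with weights \pi_k, regression vectors \beta_k and variances \sigma_k^2, a regularization grid \Lambda, and a penalty written generically as pen(J). The paper's exact parametrization and penalty may differ.

```latex
% Conditional mixture density (assumed notation, univariate response)
s_\xi(y \mid x) = \sum_{k=1}^{K} \pi_k \,
  \varphi\!\left(y;\ \beta_k^\top x,\ \sigma_k^2\right),
\qquad \xi = (\pi_k, \beta_k, \sigma_k)_{1 \le k \le K}

% Step 1: l1-penalized maximum likelihood, for each \lambda in a grid \Lambda
\hat\xi^{\mathrm{Lasso}}(\lambda) \in \operatorname*{arg\,min}_{\xi}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n} \log s_\xi(y_i \mid x_i)
          + \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_1 \right\}

% Step 2: relevant variables selected at level \lambda
J_\lambda = \left\{ j :\ \hat\beta^{\mathrm{Lasso}}_{k,j}(\lambda) \neq 0
  \ \text{for some } k \right\}

% Step 3: maximum likelihood restricted to the selected variables
\hat s_{J_\lambda} \in \operatorname*{arg\,max}_{\xi \,:\, \beta_{k,j} = 0 \ \forall j \notin J_\lambda}
  \ \frac{1}{n} \sum_{i=1}^{n} \log s_\xi(y_i \mid x_i)

% Step 4: penalized selection among the random collection \{ J_\lambda \}_{\lambda \in \Lambda}
\hat J \in \operatorname*{arg\,min}_{J \in \{ J_\lambda \}_{\lambda \in \Lambda}}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n} \log \hat s_{J}(y_i \mid x_i)
          + \operatorname{pen}(J) \right\}
```

The model collection in Step 4 is random because the sets J_\lambda are built from the data in Step 1, which is why the abstract refers to a model selection theorem for a random model collection.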
Cite
@article{arxiv.1409.1331,
title = {Finite mixture regression: A sparse variable selection by model selection for clustering},
author = {Emilie Devijver},
journal = {arXiv preprint arXiv:1409.1331},
year = {2014}
}
Comments
20 pages. arXiv admin note: text overlap with arXiv:1103.2021 by other authors