Finite mixture regression: A sparse variable selection by model selection for clustering

doi:10.48550/arXiv.1409.1331

Devijver, Emilie

We consider a finite mixture of Gaussian regression model for high-dimensional data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by a maximum likelihood estimator, restricted to the relevant variables selected by an ℓ1-penalized maximum likelihood estimator. We obtain an oracle inequality satisfied by this estimator with a Jensen-Kullback-Leibler type loss. Our oracle inequality is deduced from a general model selection theorem for maximum likelihood estimators with a random model collection. From it we derive the shape of the penalty in the criterion, which depends on the complexity of the random model collection.

Publication:

arXiv e-prints

Pub Date:

September 2014

DOI:

10.48550/arXiv.1409.1331

arXiv:

arXiv:1409.1331

Bibcode:

2014arXiv1409.1331D

Keywords:

Mathematics - Statistics Theory

E-Print:

20 pages. arXiv admin note: text overlap with arXiv:1103.2021 by other authors
