Estimation of clusterwise linear regression models with a shrinkage-like approach
Abstract
Constrained approaches to maximum likelihood estimation for finite mixtures of normals have been presented in the literature. We propose a fully data-dependent constrained method for maximum likelihood estimation of clusterwise linear regression models, extending previous work on equivariant data-driven estimation of finite mixtures of Gaussians for classification. The method imposes plausible bounds on the component variances, based on a target value estimated from the data, which we take to be the homoscedastic variance. The present work focuses not only on classification recovery but also on how well the model parameters are estimated. In particular, the paper sheds light on the shrinkage-like interpretation of the procedure, where the target is the homoscedastic model: this concerns not only how close the estimated scales are to the target, but also the estimated clusterwise linear regressions and the classification. We show, using simulated and real data, that our approach yields a final model that is the most appropriate compromise, for the data at hand, between the heteroscedastic and homoscedastic models.
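To make the shrinkage-like reading of the variance bounds concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical parametrization in which each component variance is kept within `[target * sqrt(c), target / sqrt(c)]` for a tuning constant `c` in (0, 1]; the abstract does not state the exact form of the bounds, and the names `clip_variances`, `target`, and `c` are illustrative only. Under this assumption, `c = 1` forces every variance to the homoscedastic target, while `c` close to 0 leaves the heteroscedastic estimates essentially untouched.

```python
import math


def clip_variances(variances, target, c):
    """Project component variances onto [target*sqrt(c), target/sqrt(c)].

    Assumed parametrization (not taken from the abstract):
    `target` plays the role of the homoscedastic variance and
    `c` in (0, 1] controls how tightly variances are pulled toward it.
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must lie in (0, 1]")
    lower = target * math.sqrt(c)
    upper = target / math.sqrt(c)
    # Each variance is moved to the nearest point of the admissible interval.
    return [min(max(v, lower), upper) for v in variances]


# Hypothetical unconstrained variance estimates for three components.
raw = [0.01, 1.0, 9.0]
target = 1.0

tight = clip_variances(raw, target, 1.0)    # bounds collapse onto the target
mid = clip_variances(raw, target, 0.25)     # bounds are [0.5, 2.0]
loose = clip_variances(raw, target, 1e-6)   # bounds are wide enough to be inactive
```

In this sketch the constant `c` is fixed by hand; the abstract describes the method as fully data-dependent, so in the actual procedure the amount of shrinkage would be chosen from the data rather than supplied by the user.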
- Publication: arXiv e-prints
- Pub Date: November 2016
- DOI: 10.48550/arXiv.1611.03309
- arXiv: arXiv:1611.03309
- Bibcode: 2016arXiv161103309D
- Keywords: Statistics - Methodology