Beyond Black Box Densities: Parameter Learning for the Deviated Components
Abstract
As we collect additional samples from a data population for which a known density function estimate may have been previously obtained by a black box method, the increased complexity of the data set may result in the true density being deviated from the known estimate by a mixture distribution. To model this phenomenon, we consider the \emph{deviating mixture model} $(1\lambda^{*})h_0 + \lambda^{*} (\sum_{i = 1}^{k} p_{i}^{*} f(x\theta_{i}^{*}))$, where $h_0$ is a known density function, while the deviated proportion $\lambda^{*}$ and latent mixing measure $G_{*} = \sum_{i = 1}^{k} p_{i}^{*} \delta_{\theta_i^{*}}$ associated with the mixture distribution are unknown. Via a novel notion of distinguishability between the known density $h_{0}$ and the deviated mixture distribution, we establish rates of convergence for the maximum likelihood estimates of $\lambda^{*}$ and $G^{*}$ under Wasserstein metric. Simulation studies are carried out to illustrate the theory.
 Publication:

arXiv eprints
 Pub Date:
 February 2022
 DOI:
 10.48550/arXiv.2202.02651
 arXiv:
 arXiv:2202.02651
 Bibcode:
 2022arXiv220202651D
 Keywords:

 Statistics  Machine Learning;
 Computer Science  Machine Learning;
 Mathematics  Statistics Theory
 EPrint:
 Accepted at NeurIPS 2022. Dat Do and Nhat Ho contributed equally to this work