Minimax Estimation of Kernel Mean Embeddings
Abstract
In this paper, we study the minimax estimation of the Bochner integral $$\mu_k(P):=\int_{\mathcal{X}} k(\cdot,x)\,dP(x),$$ also called the kernel mean embedding, based on random samples drawn i.i.d. from $P$, where $k:\mathcal{X}\times\mathcal{X}\rightarrow\mathbb{R}$ is a positive definite kernel. Various estimators $\hat{\theta}_n$ of $\mu_k(P)$ (including the empirical estimator) have been studied in the literature, all of which satisfy $\bigl\|\hat{\theta}_n-\mu_k(P)\bigr\|_{\mathcal{H}_k}=O_P(n^{-1/2})$, with $\mathcal{H}_k$ being the reproducing kernel Hilbert space induced by $k$. The main contribution of the paper is in showing that the above-mentioned rate of $n^{-1/2}$ is minimax in the $\|\cdot\|_{\mathcal{H}_k}$ and $\|\cdot\|_{L^2(\mathbb{R}^d)}$ norms over the class of discrete measures and the class of measures that have an infinitely differentiable density, with $k$ being a continuous translation-invariant kernel on $\mathbb{R}^d$. The interesting aspect of this result is that the minimax rate is independent of the smoothness of the kernel and of the density of $P$ (if it exists). This result has practical consequences in statistical applications, as the mean embedding has been widely employed in nonparametric hypothesis testing, density estimation, causal inference and feature selection, through its relation to energy distance (and distance covariance).
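As an illustration of the $O_P(n^{-1/2})$ behaviour of the empirical estimator $\hat{\mu}_n=\frac{1}{n}\sum_{i=1}^n k(\cdot,X_i)$, the following is a minimal sketch and not code from the paper. It assumes a Gaussian kernel $k(x,y)=\exp(-(x-y)^2/(2\sigma^2))$ on the real line and $P=N(0,1)$, a setting chosen only because $\mu_k(P)$ and $\|\mu_k(P)\|_{\mathcal{H}_k}^2$ then have closed forms. The function names `squared_rkhs_error` and `mean_error`, and all parameter values, are hypothetical.

```python
import math
import random


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel on the real line (an illustrative choice of k)."""
    return math.exp(-(x - y) ** 2 / (2 * sigma ** 2))


def squared_rkhs_error(sample, sigma):
    """Squared RKHS distance between the empirical embedding and mu_k(P).

    Assumes P = N(0, 1), so the population terms are available in closed
    form. The squared norm expands as
    <mu_hat, mu_hat> - 2 <mu_hat, mu> + <mu, mu>.
    """
    n = len(sample)
    s2 = sigma ** 2
    # <mu_hat, mu_hat>: average of all Gram matrix entries.
    emp = sum(gaussian_kernel(a, b, sigma) for a in sample for b in sample) / n ** 2
    # <mu_hat, mu>: average of mu_k(P)(X_i), where mu_k(P)(x) = E k(x, Y), Y ~ N(0, 1).
    cross = sum(
        math.sqrt(s2 / (s2 + 1)) * math.exp(-a * a / (2 * (s2 + 1))) for a in sample
    ) / n
    # <mu, mu> = E k(X, Y) for independent X, Y ~ N(0, 1), i.e. X - Y ~ N(0, 2).
    pop = math.sqrt(s2 / (s2 + 2))
    return emp - 2 * cross + pop


def mean_error(n, sigma=1.0, reps=10, seed=0):
    """Average RKHS-norm error of the empirical estimator over `reps` samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Clip tiny negative values caused by floating-point rounding.
        total += math.sqrt(max(squared_rkhs_error(sample, sigma), 0.0))
    return total / reps


# If the n^{-1/2} rate holds, sqrt(n) * error should stay roughly constant.
for n in (50, 100, 200, 400):
    err = mean_error(n)
    print(n, err, math.sqrt(n) * err)
```

Under these assumptions, the last printed column should stay of the same order as $n$ grows, which is consistent with the $n^{-1/2}$ upper bound. The simulation says nothing about the minimax lower bound, which is the paper's actual contribution.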
 Publication:

arXiv e-prints
 Pub Date:
 February 2016
 arXiv:
 arXiv:1602.04361
 Bibcode:
 2016arXiv160204361T
 Keywords:

 Mathematics - Statistics Theory;
 62G05;
 62G07