Measure Estimation in the Barycentric Coding Model

doi:10.48550/arXiv.2201.12195

This paper considers the problem of measure estimation under the barycentric coding model (BCM), in which an unknown measure is assumed to belong to the set of Wasserstein-2 barycenters of a finite set of known measures. Estimating a measure under this model is equivalent to estimating the unknown barycentric coordinates. We provide novel geometrical, statistical, and computational insights for measure estimation under the BCM, consisting of three main results. Our first main result leverages the Riemannian geometry of Wasserstein-2 space to provide a procedure for recovering the barycentric coordinates as the solution to a quadratic optimization problem assuming access to the true reference measures. The essential geometric insight is that the parameters of this quadratic problem are determined by inner products between the optimal displacement maps from the given measure to the reference measures defining the BCM. Our second main result then establishes an algorithm for solving for the coordinates in the BCM when all the measures are observed empirically via i.i.d. samples. We prove precise rates of convergence for this algorithm -- determined by the smoothness of the underlying measures and their dimensionality -- thereby guaranteeing its statistical consistency. Finally, we demonstrate the utility of the BCM and associated estimation procedures in three application areas: (i) covariance estimation for Gaussian measures; (ii) image processing; and (iii) natural language processing.
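The first result in the abstract can be illustrated in a toy setting. The sketch below is not the paper's algorithm. It assumes one-dimensional Gaussian measures, where the optimal transport maps and the Wasserstein-2 barycenter have closed forms, and it reads the abstract's quadratic problem as minimizing λᵀAλ over the probability simplex, where A_ij is the L²(target) inner product of the displacement maps T_i − Id and T_j − Id. The function names (`project_to_simplex`, `displacement_gram`, `estimate_coordinates`) are hypothetical, and projected gradient descent is used here only as a generic solver.

```python
# Toy sketch: recover barycentric coordinates for 1-D Gaussians.
# Assumption: each measure is N(m, s^2), stored as a (mean, std) pair.
# In 1-D the optimal map from N(m0, s0^2) to N(mi, si^2) is
#   T_i(x) = mi + (si / s0) * (x - m0),
# so the displacement inner products are available in closed form.


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:
            theta = t  # last index where the condition holds sets the shift
    return [max(x - theta, 0.0) for x in v]


def displacement_gram(target, references):
    """A[i][j] = E_target[(T_i(X) - X) * (T_j(X) - X)] for 1-D Gaussians."""
    m0, s0 = target
    return [[(mi - m0) * (mj - m0) + (si - s0) * (sj - s0)
             for (mj, sj) in references]
            for (mi, si) in references]


def estimate_coordinates(target, references, iterations=5000):
    """Minimize lam^T A lam over the simplex by projected gradient descent."""
    A = displacement_gram(target, references)
    k = len(references)
    trace = sum(A[i][i] for i in range(k))
    if trace == 0.0:  # target equals every reference; any weights work
        return [1.0 / k] * k
    step = 1.0 / (2.0 * trace)  # 2 * trace(A) bounds the gradient's Lipschitz constant
    lam = [1.0 / k] * k
    for _ in range(iterations):
        grad = [2.0 * sum(A[i][j] * lam[j] for j in range(k)) for i in range(k)]
        lam = project_to_simplex([lam[i] - step * grad[i] for i in range(k)])
    return lam


# Usage: build a target that is an exact barycenter, then recover its weights.
references = [(0.0, 1.0), (4.0, 1.0), (0.0, 3.0)]  # illustrative (mean, std) pairs
true_weights = [0.5, 0.3, 0.2]
# For 1-D Gaussians the W2 barycenter averages means and standard deviations.
target = (sum(w * m for w, (m, _) in zip(true_weights, references)),
          sum(w * s for w, (_, s) in zip(true_weights, references)))
print(estimate_coordinates(target, references))  # should be close to true_weights
```

With three references whose (mean, std) pairs are not collinear, the minimizer is unique, so the recovered weights should match the ones used to build the target. The empirical setting in the second result, where maps are estimated from i.i.d. samples, is not covered by this sketch.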

Publication: arXiv e-prints

Pub Date: January 2022

DOI: 10.48550/arXiv.2201.12195

arXiv: arXiv:2201.12195

Bibcode: 2022arXiv220112195W

Keywords: Statistics - Machine Learning; Computer Science - Data Structures and Algorithms; Computer Science - Machine Learning; Mathematics - Probability; Mathematics - Statistics Theory

E-Print: ICML 2022
