Detection of low dimensionality and data denoising via set estimation techniques
Abstract
This work is closely related to the theories of set estimation and manifold estimation. Our object of interest is a (possibly lower-dimensional) compact set $S \subset {\mathbb R}^d$. The general aim is to identify, via stochastic procedures, some qualitative or quantitative features of $S$ of a geometric or topological character. The only available information is a random sample of points drawn on $S$. Here, "to identify" means to obtain a correct answer almost surely (a.s.) as the sample size tends to infinity. More specifically, the paper aims to give partial answers to the following questions: Is $S$ full-dimensional? Is $S$ "close to a lower-dimensional set" $\mathcal{M}$? If so, can we estimate $\mathcal{M}$ or some functionals of $\mathcal{M}$ (in particular, the Minkowski content of $\mathcal{M}$)? As an important auxiliary tool in answering these questions, a denoising procedure is proposed to partially remove the noise from the original data. The theoretical results are complemented with some simulations and graphical illustrations.
- Publication: arXiv e-prints
- Pub Date: February 2017
- arXiv: arXiv:1702.05193
- Bibcode: 2017arXiv170205193A
- Keywords: Mathematics - Statistics Theory