Deterministic $O(1)$-Approximation Algorithms to 1-Center Clustering with Outliers
Abstract
The 1-center clustering with outliers problem asks for a prototypical robust statistic that approximates the location of a cluster of points. Given a constant $0 < \alpha < 1$ and $n$ points such that $\alpha n$ of them lie in some (unknown) ball of radius $r$, the goal is to compute a ball of radius $O(r)$ that also contains $\alpha n$ points. The problem can be formulated with the points in a normed vector space such as $\mathbb{R}^d$ or in a general metric space. It has a simple randomized solution: a randomly selected point is a correct solution with constant probability, and its correctness can be verified in linear time. However, the deterministic complexity of this problem was not known. In this paper, for any $\ell_p$ vector space, we show an $O(nd)$-time solution with a ball of radius $O(r)$ for a fixed $\alpha > \frac{1}{2}$. For any normed vector space, we show an $O(nd)$-time solution with a ball of radius $O(r)$ when $\alpha > \frac{1}{2}$, as well as an $O(nd \log^{(k)}(n))$-time solution with a ball of radius $O(r)$ for all $\alpha > 0$ and $k \in \mathbb{N}$, where $\log^{(k)}(n)$ denotes the $k$th iterated logarithm, assuming distance computation and vector space operations take $O(d)$ time. For an arbitrary metric space, we show for any $C \in \mathbb{N}$ an $O(n^{1+1/C})$-time solution that finds a ball of radius $2Cr$, assuming distance computation between any pair of points takes $O(1)$ time. Moreover, this algorithm is optimal for general metric spaces: we show that for any fixed $\alpha, C$, there is no $o(n^{1+1/C})$-query, and thus no $o(n^{1+1/C})$-time, solution that deterministically finds a ball of radius $2Cr$.
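The simple randomized solution mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's deterministic algorithm, and it rests on assumptions that the abstract does not make: the radius $r$ is taken as known, the points live in Euclidean space, and the names `count_within` and `random_center` are hypothetical. The idea is that a uniformly random input point falls inside the unknown ball with probability about $\alpha$; by the triangle inequality, the ball of radius $2r$ around such a point contains every point of the unknown ball, and one linear pass over the input checks this.

```python
import math
import random
from typing import Optional, Sequence

Point = Sequence[float]


def count_within(points: Sequence[Point], center: Point, radius: float) -> int:
    """Linear-time verification: count points within `radius` of `center`."""
    return sum(1 for q in points if math.dist(center, q) <= radius)


def random_center(
    points: Sequence[Point],
    alpha: float,
    r: float,
    trials: int = 50,
    seed: int = 0,
) -> Optional[Point]:
    """Sample input points until one passes the 2r-ball check.

    Assumption (not from the abstract): r is known in advance.
    Returns None if no sampled point passes within `trials` attempts.
    """
    rng = random.Random(seed)
    need = math.ceil(alpha * len(points))
    for _ in range(trials):
        candidate = rng.choice(points)
        # If the candidate lies in the unknown ball of radius r, every point
        # of that ball is within distance 2r of it (triangle inequality).
        if count_within(points, candidate, 2 * r) >= need:
            return candidate
    return None


# Six points inside a unit ball around the origin, plus four distant outliers.
cluster = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5), (0.3, 0.3)]
outliers = [(100.0, 0.0), (0.0, 100.0), (-100.0, 0.0), (0.0, -100.0)]
points = cluster + outliers
center = random_center(points, alpha=0.5, r=1.0)
```

Each trial costs one pass over the data, so the expected work is linear for constant $\alpha$. The sketch relies on randomness to find a good candidate, which is exactly what the deterministic algorithms described in the abstract avoid.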
- Publication: arXiv e-prints
- Pub Date: June 2018
- DOI: 10.48550/arXiv.1806.07356
- arXiv: arXiv:1806.07356
- Bibcode: 2018arXiv180607356N
- Keywords: Computer Science - Data Structures and Algorithms; 68W25
- E-Print: 16 pages, 1 figure. Preliminary version in APPROX, 2018. Keywords: Deterministic, approximation algorithm, cluster, statistic