Constant Approximation for $k$-Median and $k$-Means with Outliers via Iterative Rounding
Abstract
In this paper, we present a new iterative rounding framework for many clustering problems. Using this, we obtain an $(\alpha_1 + \epsilon \leq 7.081 + \epsilon)$-approximation algorithm for $k$-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen [Chen, SODA 2018]. For $k$-means with outliers, we give an $(\alpha_2 + \epsilon \leq 53.002 + \epsilon)$-approximation, which is the first $O(1)$-approximation for this problem. The iterative algorithm framework is very versatile; we show how it can be used to give $\alpha_1$- and $(\alpha_1 + \epsilon)$-approximation algorithms for the matroid and knapsack median problems respectively, improving upon the previous best approximation ratios of $8$ [Swamy, ACM Trans. Algorithms] and $17.46$ [Byrka et al., ESA 2015]. The natural LP relaxation for the $k$-median/$k$-means with outliers problem has an unbounded integrality gap. In spite of this negative result, our iterative rounding framework shows that we can round an LP solution to an almost-integral solution of small cost, in which we have at most two fractionally open facilities. Thus, the LP integrality gap arises due to the gap between almost-integral and fully-integral solutions. Then, using a preprocessing procedure, we show how to convert an almost-integral solution to a fully-integral solution, losing only a constant factor in the approximation ratio. By further using a sparsification technique, the additive factor loss incurred by the conversion can be reduced to any $\epsilon > 0$.
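For concreteness, the natural LP relaxation mentioned above can be sketched as follows for $k$-median with at most $m$ outliers, over a facility set $F$ and client set $C$ with metric $d$; the variable names $x_{ij}$ (assignment of client $j$ to facility $i$) and $y_i$ (fractional opening of facility $i$) are illustrative notation, not necessarily the paper's own:

$$
\begin{aligned}
\min \quad & \sum_{i \in F} \sum_{j \in C} d(i,j)\, x_{ij} \\
\text{s.t.} \quad & x_{ij} \le y_i && \forall\, i \in F,\ j \in C \\
& \sum_{i \in F} x_{ij} \le 1 && \forall\, j \in C \\
& \sum_{i \in F} y_i \le k \\
& \sum_{j \in C} \sum_{i \in F} x_{ij} \ge |C| - m \\
& x, y \ge 0
\end{aligned}
$$

In an integral solution every $y_i \in \{0,1\}$; the almost-integral solutions produced by the iterative rounding framework relax this so that at most two facilities may have fractional $y_i$, which is where the unbounded integrality gap is confined.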
 Publication:

arXiv e-prints
 Pub Date:
 November 2017
 arXiv:
 arXiv:1711.01323
 Bibcode:
 2017arXiv171101323K
 Keywords:

 Computer Science - Data Structures and Algorithms