All instantiations of the greedy algorithm for the shortest superstring problem are equivalent
Abstract
In the Shortest Common Superstring problem (SCS), one needs to find the shortest superstring for a set of strings. While SCS is NP-hard and MAX-SNP-hard, the Greedy Algorithm "choose two strings with the largest overlap; merge them; repeat" achieves a constant factor approximation that is known to be at most 3.5 and conjectured to be equal to 2. The Greedy Algorithm is not deterministic, so its instantiations with different tie-breaking rules may have different approximation factors. In this paper, we show that it is not the case: all factors are equal. To prove this, we show how to transform a set of strings so that all overlaps are different whereas their ratios stay roughly the same. We also reveal connections between the original version of SCS and the following one: find a~superstring minimizing the number of occurrences of a given symbol. It turns out that the latter problem is equivalent to the original one.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2021
- DOI:
- 10.48550/arXiv.2102.05579
- arXiv:
- arXiv:2102.05579
- Bibcode:
- 2021arXiv210205579N
- Keywords:
-
- Computer Science - Data Structures and Algorithms