Tight Hardness Results for Consensus Problems on Circular Strings and Time Series
Abstract
Consensus problems for strings and sequences appear in numerous application contexts, ranging from bioinformatics and data mining to machine learning. Closing some gaps in the literature, we show that several fundamental problems in this context are NP-hard and W[1]-hard, and that the known (partially brute-force) algorithms are close to optimal assuming the Exponential Time Hypothesis. One of our main contributions is settling the complexity status of computing a mean in dynamic time warping spaces, a problem which, as pointed out by Brill et al. [DMKD 2019], has suffered from many unproven or false assumptions in the literature. We prove this problem to be NP-hard and additionally show that a recent dynamic programming algorithm is essentially optimal. In this context, we study a broad family of circular string alignment problems. This family serves as a key ingredient of our hardness reductions and is of independent (practical) interest in molecular biology. In particular, we show tight hardness results and running time lower bounds for Circular Consensus String; notably, the corresponding non-circular version is easily solvable in linear time.
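The contrast between the non-circular and the circular problem can be illustrated with a minimal sketch. It assumes that the objective is the sum of Hamming distances between the consensus string and the (shifted) input strings, and that all inputs have equal length; the function names `consensus_cost`, `rotate`, and `circular_consensus` are hypothetical and not taken from the paper. The non-circular case reduces to a column-wise majority vote, while the circular case below simply tries every combination of shifts, which is exponential in the number of strings.

```python
from collections import Counter
from itertools import product


def consensus_cost(strings):
    """Column-wise majority consensus and its total Hamming distance.

    Assumes all strings have the same length. Runs in time linear in
    the total input size (for a constant-size alphabet).
    """
    consensus = []
    cost = 0
    for column in zip(*strings):
        symbol, count = Counter(column).most_common(1)[0]
        consensus.append(symbol)
        cost += len(column) - count  # mismatches in this column
    return "".join(consensus), cost


def rotate(s, shift):
    """Cyclically shift s to the left by `shift` positions."""
    if not s:
        return s
    shift %= len(s)
    return s[shift:] + s[:shift]


def circular_consensus(strings):
    """Brute force over all shifts (illustrative sketch only).

    The first string is kept fixed, since shifting all strings by the
    same amount does not change the cost. Tries n^(k-1) shift
    combinations for k strings of length n.
    """
    n = len(strings[0])
    best = None
    for shifts in product(range(n), repeat=len(strings) - 1):
        rotated = [strings[0]] + [
            rotate(s, d) for s, d in zip(strings[1:], shifts)
        ]
        consensus, cost = consensus_cost(rotated)
        if best is None or cost < best[1]:
            best = (consensus, cost, (0,) + shifts)
    return best
```

For the inputs `["abc", "bca", "cab"]`, the non-circular consensus has cost 6, whereas allowing shifts aligns all three strings perfectly at cost 0. The brute-force search is only meant to show where the extra difficulty comes from; it is not the algorithm analyzed in the paper.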
- Publication: arXiv e-prints
- Pub Date: April 2018
- DOI: 10.48550/arXiv.1804.02854
- arXiv: arXiv:1804.02854
- Bibcode: 2018arXiv180402854B
- Keywords: Computer Science - Discrete Mathematics; Computer Science - Data Structures and Algorithms