LinearTime Algorithm for Long LCF with $k$ Mismatches
Abstract
In the Longest Common Factor with $k$ Mismatches (LCF$_k$) problem, we are given two strings $X$ and $Y$ of total length $n$, and we are asked to find a pair of maximallength factors, one of $X$ and the other of $Y$, such that their Hamming distance is at most $k$. Thankachan et al. show that this problem can be solved in $\mathcal{O}(n \log^k n)$ time and $\mathcal{O}(n)$ space for constant $k$. We consider the LCF$_k$($\ell$) problem in which we assume that the sought factors have length at least $\ell$, and the LCF$_k$($\ell$) problem for $\ell=\Omega(\log^{2k+2} n)$, which we call the Long LCF$_k$ problem. We use difference covers to reduce the Long LCF$_k$ problem to a task involving $m=\mathcal{O}(n/\log^{k+1}n)$ synchronized factors. The latter can be solved in $\mathcal{O}(m \log^{k+1}m)$ time, which results in a lineartime algorithm for Long LCF$_k$. In general, our solution to LCF$_k$($\ell$) for arbitrary $\ell$ takes $\mathcal{O}(n + n \log^{k+1} n/\sqrt{\ell})$ time.
 Publication:

arXiv eprints
 Pub Date:
 February 2018
 arXiv:
 arXiv:1802.06369
 Bibcode:
 2018arXiv180206369C
 Keywords:

 Computer Science  Data Structures and Algorithms
 EPrint:
 submitted to CPM 2018