Approximating Weighted Duo-Preservation in Comparative Genomics
Abstract
Motivated by comparative genomics, Chen et al. [9] introduced the Maximum Duo-preservation String Mapping (MDSM) problem, in which we are given two strings $s_1$ and $s_2$ over the same alphabet and the goal is to find a mapping $\pi$ between them that maximizes the number of preserved duos. A duo is any pair of consecutive characters in a string, and it is preserved by the mapping if its two consecutive characters in $s_1$ are mapped to the same two consecutive characters in $s_2$. The MDSM problem is known to be NP-hard, and approximation algorithms exist for it [3, 5, 13], but all of them consider only the "unweighted" version of the problem, in the sense that a duo from $s_1$ is preserved by mapping it to any identical duo in $s_2$, regardless of their positions in the respective strings. However, in comparative genomics it is desirable to find mappings that favor preserving duos that are "closer" to each other under some distance measure [19]. In this paper, we introduce a generalized version of the problem, called the Maximum-Weight Duo-preservation String Mapping (MWDSM) problem, which captures both duo-preservation and duo-distance measures: mapping a duo from $s_1$ to each preserved duo in $s_2$ has a weight indicating the "closeness" of the two duos. The objective of the MWDSM problem is to find a mapping that maximizes the total weight of preserved duos. We give a polynomial-time 6-approximation algorithm for this problem.
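To make the definitions concrete, the following Python sketch evaluates a given mapping; it does not search for a good one and is not the paper's 6-approximation algorithm. It assumes the mapping $\pi$ is a character-preserving bijection between positions of $s_1$ and $s_2$, represented as a list `pi` with `pi[i]` the position in $s_2$ that position `i` of $s_1$ maps to. The names `preserved_duos`, `total_weight`, and `closeness` are hypothetical, and the closeness weight used in the example is only an illustration, not a weight function taken from the paper.

```python
from fractions import Fraction


def preserved_duos(s1, s2, pi):
    """Return the start positions i in s1 of duos (i, i+1) preserved by pi.

    pi[i] is the position in s2 that position i of s1 is mapped to.
    Assumption: pi must be a bijection that maps each character to an
    equal character.
    """
    if len(s1) != len(s2) or sorted(pi) != list(range(len(s2))):
        raise ValueError("pi must be a bijection between positions of s1 and s2")
    if any(s1[i] != s2[pi[i]] for i in range(len(s1))):
        raise ValueError("pi must map each character to an equal character")
    # A duo is preserved when its two characters land on consecutive
    # positions of s2, in the same order.
    return [i for i in range(len(s1) - 1) if pi[i + 1] == pi[i] + 1]


def total_weight(s1, s2, pi, weight):
    """Sum weight(i, j) over preserved duos, where j = pi[i] is the duo's start in s2."""
    return sum(weight(i, pi[i]) for i in preserved_duos(s1, s2, pi))


def closeness(i, j):
    """Hypothetical weight: duos at nearby positions score higher."""
    return Fraction(1, 1 + abs(i - j))


s1, s2 = "abcab", "ababc"
pi = [2, 3, 4, 0, 1]  # "abc" -> s2[2:5], "ab" -> s2[0:2]

unweighted = len(preserved_duos(s1, s2, pi))
weighted = total_weight(s1, s2, pi, closeness)
print(unweighted, weighted)
```

With a constant weight of 1 for every preserved duo, `total_weight` reduces to the unweighted MDSM objective, which is the sense in which the weighted problem generalizes the unweighted one.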
- Publication: arXiv e-prints
- Pub Date: August 2017
- DOI: 10.48550/arXiv.1708.09325
- arXiv: arXiv:1708.09325
- Bibcode: 2017arXiv170809325M
- Keywords: Computer Science - Data Structures and Algorithms
- E-Print: Appeared in proceedings of the 23rd International Computing and Combinatorics Conference (COCOON 2017)