A characterization of the number of subsequences obtained via the deletion channel
Abstract
Motivated by the study of deletion channels, this work presents improved bounds on the number of subsequences obtained from a binary string X of length n under t deletions. It is known that the number of subsequences in this setting depends strongly on the number of runs in X, where a run is a maximal sequence of the same character. Our improved bounds are obtained by a structural analysis of the family of r-run strings X, in which we identify the extremal strings with respect to the number of subsequences. Specifically, for every r, we present r-run strings with the minimum (respectively, maximum) number of subsequences under any t deletions, and perform an exact analysis of the number of subsequences of these extremal strings.
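To make the quantities in the abstract concrete, the following is a minimal brute-force sketch, not the paper's method. It counts the runs of a string and enumerates the distinct subsequences left after exactly t deletions. The function names `count_runs` and `count_subsequences` are illustrative assumptions, and the enumeration is exponential in n, so it is only suitable for short strings.

```python
from itertools import combinations, groupby


def count_runs(x: str) -> int:
    """Number of runs, i.e. maximal blocks of one repeated character."""
    return sum(1 for _ in groupby(x))


def count_subsequences(x: str, t: int) -> int:
    """Number of distinct strings obtained from x by deleting exactly t characters.

    Brute force: try every choice of n - t positions to keep and
    collect the resulting strings in a set to remove duplicates.
    """
    n = len(x)
    if not 0 <= t <= n:
        raise ValueError("t must satisfy 0 <= t <= len(x)")
    kept = n - t
    return len({"".join(x[i] for i in idx) for idx in combinations(range(n), kept)})


if __name__ == "__main__":
    # Strings of the same length but with different numbers of runs.
    for x in ("0000", "0011", "0101"):
        print(x, "runs:", count_runs(x), "t=1:", count_subsequences(x, 1))
```

For a single deletion (t = 1), the count equals the number of runs, which illustrates the dependence on runs that the abstract describes. For larger t, strings with the same number of runs can yield different counts, which is the gap the paper's bounds address.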
- Publication: arXiv e-prints
- Pub Date: February 2012
- DOI: 10.48550/arXiv.1202.1644
- arXiv: arXiv:1202.1644
- Bibcode: 2012arXiv1202.1644L
- Keywords: Computer Science - Information Theory
- E-Print: 9 pages, 4 figures, a short version submitted to ISIT2012