Reconstruction from Substrings with Partial Overlap
Abstract
This paper introduces a new family of reconstruction codes motivated by applications in DNA data storage and sequencing. In such applications, DNA strands are sequenced by reading some subset of their substrings. Previous works considered two extreme cases: either all substrings of some fixed length are read, or substrings are read with no overlap. This work considers the setup in which consecutive substrings are read with some given minimum overlap. First, upper bounds are provided on the attainable rates of codes that guarantee unique reconstruction. Then, efficient constructions of asymptotically optimal codes that meet these bounds are presented.
- Publication: arXiv e-prints
- Pub Date: May 2022
- arXiv: arXiv:2205.03933
- Bibcode: 2022arXiv220503933Y
- Keywords: Computer Science - Information Theory
- E-Print: 6 pages, 2 figures