Substring Density Estimation from Traces
Abstract
In the trace reconstruction problem, one seeks to reconstruct a binary string $s$ from a collection of traces, each of which is obtained by passing $s$ through a deletion channel. It is known that $\exp(\tilde O(n^{1/5}))$ traces suffice to reconstruct any length-$n$ string with high probability. We consider a variant of the trace reconstruction problem where the goal is to recover a "density map" that indicates the locations of each length-$k$ substring throughout $s$. We show that $\epsilon^{-2}\cdot \text{poly}(n)$ traces suffice to recover the density map with error at most $\epsilon$. As a result, when restricted to a set of source strings whose minimum "density map distance" is at least $1/\text{poly}(n)$, the trace reconstruction problem can be solved with polynomially many traces.
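To make the trace model concrete, here is a minimal sketch of a deletion channel. It assumes the standard i.i.d. model, in which each bit of $s$ is deleted independently with a fixed probability `q`. The abstract does not state the deletion probability, and the names `deletion_channel`, `sample_traces`, and `is_subsequence` are hypothetical, chosen only for illustration.

```python
import random


def deletion_channel(s, q, rng):
    """Return one trace of s: each character is deleted independently
    with probability q (assumed i.i.d. deletion model)."""
    return "".join(ch for ch in s if rng.random() >= q)


def sample_traces(s, q, num_traces, seed=0):
    """Draw num_traces independent traces of s from the deletion channel."""
    rng = random.Random(seed)
    return [deletion_channel(s, q, rng) for _ in range(num_traces)]


def is_subsequence(t, s):
    """Check that t can be obtained from s by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in t)


if __name__ == "__main__":
    source = "1011001110"
    for trace in sample_traces(source, q=0.3, num_traces=5, seed=1):
        # Every trace is a subsequence of the source string.
        print(trace, is_subsequence(trace, source))
```

Each trace is a random subsequence of the source string, and a reconstruction algorithm sees only a collection of such traces.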
- Publication: arXiv e-prints
- Pub Date: October 2022
- DOI: 10.48550/arXiv.2210.10917
- arXiv: arXiv:2210.10917
- Bibcode: 2022arXiv221010917M
- Keywords: Computer Science - Information Theory; Computer Science - Data Structures and Algorithms; Mathematics - Probability; Mathematics - Statistics Theory
- E-Print: 22 pages, 3 figures