Substring Density Estimation from Traces
Abstract
In the trace reconstruction problem, one seeks to reconstruct a binary string $s$ from a collection of traces, each of which is obtained by passing $s$ through a deletion channel. It is known that $\exp(\tilde O(n^{1/5}))$ traces suffice to reconstruct any length-$n$ string with high probability. We consider a variant of the trace reconstruction problem where the goal is to recover a "density map" that indicates the locations of each length-$k$ substring throughout $s$. We show that $\epsilon^{-2}\cdot \text{poly}(n)$ traces suffice to recover the density map with error at most $\epsilon$. As a result, when restricted to a set of source strings whose minimum "density map distance" is at least $1/\text{poly}(n)$, the trace reconstruction problem can be solved with polynomially many traces.
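To make the setting concrete, the following is a minimal sketch of how traces might be generated. It assumes the usual i.i.d. deletion channel, in which each symbol of $s$ is deleted independently with some probability q; the abstract does not fix this parameter, and the function names `deletion_channel` and `sample_traces` are illustrative only. It does not implement the density map estimator itself.

```python
import random


def deletion_channel(s, q, rng):
    """Return one trace of s.

    Assumption: each symbol is deleted independently with probability q,
    and the surviving symbols are concatenated in their original order.
    """
    return "".join(c for c in s if rng.random() >= q)


def sample_traces(s, q, num_traces, seed=0):
    """Draw num_traces independent traces of s (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [deletion_channel(s, q, rng) for _ in range(num_traces)]


if __name__ == "__main__":
    # Hypothetical source string and deletion probability, for illustration.
    for trace in sample_traces("1011001110", q=0.3, num_traces=5):
        print(trace)
```

Every trace is a subsequence of $s$, and the positions of the surviving symbols are not observed, which is what makes locating substrings within $s$ nontrivial.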
 Publication:

arXiv e-prints
 Pub Date:
 October 2022
 DOI:
 10.48550/arXiv.2210.10917
 arXiv:
 arXiv:2210.10917
 Bibcode:
 2022arXiv221010917M
 Keywords:

 Computer Science - Information Theory;
 Computer Science - Data Structures and Algorithms;
 Mathematics - Probability;
 Mathematics - Statistics Theory
 E-Print:
 22 pages, 3 figures