Coding over Sets for DNA Storage

doi:10.48550/arXiv.1801.04882

In this paper, we study error-correcting codes for the storage of data in synthetic deoxyribonucleic acid (DNA). We investigate a storage model where data is represented by an unordered set of $M$ sequences, each of length $L$. Errors within that model are losses of whole sequences and point errors inside the sequences, such as substitutions, insertions and deletions. We propose code constructions which can correct these errors with efficient encoders and decoders. By deriving upper bounds on the cardinalities of these codes using sphere packing arguments, we show that many of our codes are close to optimal.
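As a rough illustration of the storage model described in the abstract, the following sketch draws an unordered set of $M$ distinct sequences of length $L$ and passes it through a toy channel that loses whole sequences and substitutes single symbols. It is not the paper's construction: the quaternary alphabet `ACGT`, the requirement that sequences be distinct, and the helper names `random_stored_set`, `channel`, and `raw_capacity_bits` are all assumptions made for this example, and insertions and deletions are left out for brevity.

```python
import random
from math import comb, log2

ALPHABET = "ACGT"  # assumption: quaternary DNA alphabet


def random_stored_set(M, L, rng):
    """Draw M distinct length-L sequences; the result is an unordered set."""
    seqs = set()
    while len(seqs) < M:
        seqs.add("".join(rng.choice(ALPHABET) for _ in range(L)))
    return frozenset(seqs)


def channel(stored, losses, substitutions, rng):
    """Hypothetical channel: drop `losses` whole sequences, then apply one
    substitution to each of `substitutions` surviving sequences."""
    survivors = rng.sample(sorted(stored), len(stored) - losses)
    hit = set(rng.sample(range(len(survivors)), substitutions))
    received = set()
    for i, seq in enumerate(survivors):
        if i in hit:
            pos = rng.randrange(len(seq))
            new = rng.choice([c for c in ALPHABET if c != seq[pos]])
            seq = seq[:pos] + new + seq[pos + 1:]
        received.add(seq)  # a corrupted sequence may collide with another
    return frozenset(received)


def raw_capacity_bits(M, L, q=4):
    """log2 of the number of unordered sets of M distinct length-L sequences
    over a q-ary alphabet, i.e. log2 of binom(q^L, M)."""
    return log2(comb(q ** L, M))


rng = random.Random(0)
stored = random_stored_set(M=8, L=6, rng=rng)
received = channel(stored, losses=2, substitutions=3, rng=rng)
print(len(stored), len(received), raw_capacity_bits(8, 6))
```

Because the receiver sees a set, the order in which the sequences were written carries no information, which is why the raw count is a binomial coefficient rather than $q^{ML}$.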

Publication: arXiv e-prints
Pub Date: January 2018
DOI: 10.48550/arXiv.1801.04882
arXiv: arXiv:1801.04882
Bibcode: 2018arXiv180104882L
Keywords: Computer Science - Information Theory; 94B60
E-Print: 5 pages
