Identifying centromeric satellites with dna-brnn
Abstract
Summary: Human alpha satellite and satellite 2/3 make up several percent of the human genome. However, identifying these sequences with traditional algorithms is computationally intensive. Here we develop dna-brnn, a recurrent neural network that learns the sequences of these two classes of centromeric repeats. It achieves high similarity to RepeatMasker and is times faster. Dna-brnn explores a novel application of deep learning and may accelerate the study of the evolution of the two repeat classes.
Availability and implementation: https://github.com/lh3/dna-nn
Contact: hli@jimmy.harvard.edu
- Publication: arXiv e-prints
- Pub Date: January 2019
- DOI: 10.48550/arXiv.1901.07327
- arXiv: arXiv:1901.07327
- Bibcode: 2019arXiv190107327L
- Keywords: Quantitative Biology - Genomics