Matching reads to many genomes with the $r$-index
Abstract
The $r$-index is a tool for compressed indexing of genomic databases for exact pattern matching, which can be used to completely align reads that perfectly match some part of a genome in the database or to find seeds for reads that do not. This paper shows how to download and install the programs ri-buildfasta and ri-align; how to call ri-buildfasta on a FASTA file to build an $r$-index for that file; and how to query that index with ri-align. Availability: The source code for these programs is released under GPLv3 and available at https://github.com/alshai/r-index.
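To make "exact pattern matching" over an index concrete, the following is a minimal toy sketch of backward search on a Burrows-Wheeler transform (BWT), the structure whose number of equal-letter runs gives the $r$-index its name. It is not the code of ri-buildfasta or ri-align and does not reflect their interfaces: the function names `bwt_from_text`, `run_count`, and `count_occurrences` are hypothetical, the BWT is built by naive rotation sorting, and rank queries are answered by linear scans rather than by the compressed, run-length-based structures a real $r$-index would use.

```python
def bwt_from_text(text):
    """Return the BWT of text + '$' (naive rotation sort; toy inputs only)."""
    s = text + "$"  # '$' is assumed to be absent from text and sorts first
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def run_count(bwt):
    """Count maximal equal-letter runs in the BWT (the quantity r)."""
    return sum(1 for i, c in enumerate(bwt) if i == 0 or c != bwt[i - 1])


def count_occurrences(bwt, pattern):
    """Count exact occurrences of pattern in the indexed text by backward search."""
    # C[c] = number of characters in the BWT that are strictly smaller than c
    C, total = {}, 0
    for c in sorted(set(bwt)):
        C[c] = total
        total += bwt.count(c)

    def rank(c, i):
        # Occurrences of c in bwt[:i]; a linear scan stands in for a real rank structure
        return bwt.count(c, 0, i)

    lo, hi = 0, len(bwt)  # half-open interval of sorted rotations matching the suffix so far
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo = C[c] + rank(c, lo)
        hi = C[c] + rank(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    genome = "GATTACAGATTACA"  # made-up sequence for illustration
    bwt = bwt_from_text(genome)
    print("runs in BWT:", run_count(bwt), "of", len(bwt), "characters")
    print("occurrences of GATTACA:", count_occurrences(bwt, "GATTACA"))
```

A read that matches perfectly yields a non-empty interval after its last character is processed; for a read that does not, the point at which the interval becomes empty marks the end of a candidate seed. Repetitive inputs, such as collections of similar genomes, tend to produce few runs in the BWT relative to its length, which is what a run-length-based index exploits.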
- Publication: arXiv e-prints
- Pub Date: August 2019
- DOI: 10.48550/arXiv.1908.01263
- arXiv: arXiv:1908.01263
- Bibcode: 2019arXiv190801263M
- Keywords: Computer Science - Data Structures and Algorithms; Quantitative Biology - Genomics