A probabilistic approach to reliability analysis of data generated in multiplex sequencing runs on the ABI SOLiD platform
Abstract
Next-generation sequencers such as the Illumina and SOLiD platforms generate large amounts of data, commonly more than 10 gigabytes of text files. The SOLiD platform in particular allows multiple samples to be sequenced in a single run, called a multiplex run, through a tagging system called Barcode. This feature requires a computational step to separate the data by sample, because the sequencer delivers a mixture of all samples in a single output. This separation must be reliable, since errors at this stage can compromise all subsequent analysis. In this context, we identified the need for a probabilistic model capable of assigning a degree of confidence to the tagging system used in multiplex sequencing. The results confirmed the adequacy of the resulting model, which can be used, among other things, to guide data filtering and to evaluate the sequencing protocol used.
- Publication:
- arXiv e-prints
- Pub Date:
- July 2021
- DOI:
- 10.48550/arXiv.2107.13537
- arXiv:
- arXiv:2107.13537
- Bibcode:
- 2021arXiv210713537L
- Keywords:
- Quantitative Biology - Genomics;
- Computer Science - Computational Engineering, Finance, and Science
- E-Print:
- 8 pages, 4 figures, 2 tables. Published in Portuguese in the Anais of the XLIII Simpósio Brasileiro de Pesquisa Operacional (SBPO 2011), 2011. URL: http://www.din.uem.br/sbpo/sbpo2011/pdf/87903.pdf