Barnacle: An Assembly Algorithm for Clone-based Sequences of Whole Genomes
Abstract
We propose Barnacle, an assembly algorithm for sequences generated by the clone-based approach, and illustrate it by assembling the human genome. Our novel method abandons the original physical-mapping-first framework. As we show, Barnacle more effectively resolves conflicts due to repeated sequences, which are the main difficulty of the sequence assembly problem. In addition, we are able to detect inconsistencies in the underlying data. We present our results on the December 2001 freeze of the public working draft of the human genome and compare them with NCBI's assembly (Build 28). The Barnacle assembly of the December 2001 freeze and the Barnacle source code are available at http://www.cs.rutgers.edu/~vchoi.
- Publication:
- arXiv e-prints
- Pub Date:
- February 2003
- DOI:
- arXiv:
- arXiv:cs/0302005
- Bibcode:
- 2003cs........2005C
- Keywords:
- Computer Science - Data Structures and Algorithms;
- Computer Science - Discrete Mathematics;
- Quantitative Biology;
- G.4;
- G2.3
- E-Print:
- 13 pages, 10 figures