On the combinatorics of sparsification
Abstract
Background: We study the sparsification of dynamic programming algorithms for folding RNA structures. Sparsification applies to the minimum free energy (mfe) folding of RNA structures and can lead to a significant reduction of time complexity. Results: We analyze the sparsification of a particular decomposition rule, $\Lambda^*$, that splits an interval for RNA secondary and pseudoknot structures of fixed topological genus. Essential for quantifying the sparsification is the size of its so-called candidate set. We present a combinatorial framework that uses the probabilities of irreducible substructures to obtain the expected size of the set of $\Lambda^*$-candidates. We compute these expectations for arc-based energy models via energy-filtered generating functions (GF) for RNA secondary structures as well as RNA pseudoknot structures. For RNA secondary structures we also consider a simplified loop-energy model. This combinatorial analysis is then compared to the expected number of $\Lambda^*$-candidates obtained from folding mfe-structures. For the mfe-folding of RNA secondary structures with a simplified loop-energy model, our results imply that sparsification reduces the time complexity by a constant factor: a 91% reduction (theory) versus a 96% reduction (experiment). For the "full" loop-energy model the reduction is 98% (experiment).
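To illustrate what a candidate set is, the following is a minimal sketch of candidate-based sparsification in a toy setting. It is not the method or energy model analyzed in the paper: it assumes simple base-pair maximization (a Nussinov-style recursion) in place of mfe-folding, and the names `max_pairs`, `PAIRS`, `min_loop`, and `splits` are hypothetical. An interval is split into an arbitrary prefix and a suffix closed by a base pair; a split point is kept as a candidate only when the closed suffix is strictly better than every other decomposition of that suffix.

```python
import random

# Allowed base pairs (canonical plus wobble); an assumption of this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs(seq, sparse=True, min_loop=3):
    """Return (maximum number of base pairs, number of split points evaluated).

    N[i][j] is the optimum for the half-open interval seq[i:j].
    With sparse=True only candidate split points are scanned;
    with sparse=False every split point is scanned (dense baseline).
    """
    n = len(seq)
    neg_inf = float("-inf")
    N = [[0] * (n + 1) for _ in range(n + 1)]
    evaluated = 0
    for j in range(1, n + 1):
        # Split points k > i for this right end j, stored with the score of
        # the closed structure on seq[k:j].
        splits = []
        for i in range(j - 1, -1, -1):
            # Score when seq[i] pairs with seq[j-1], leaving at least
            # min_loop unpaired bases between them.
            closed = neg_inf
            if j - 1 - i > min_loop and (seq[i], seq[j - 1]) in PAIRS:
                closed = 1 + N[i + 1][j - 1]
            # Every other decomposition: last base unpaired, or a prefix
            # seq[i:k] followed by a closed structure on seq[k:j].
            best_other = N[i][j - 1]
            for k, closed_kj in splits:
                evaluated += 1
                best_other = max(best_other, N[i][k] + closed_kj)
            N[i][j] = max(best_other, closed)
            # i becomes a candidate for j only if the closed structure
            # strictly beats all other decompositions of seq[i:j].
            if not sparse or closed > best_other:
                splits.append((i, closed))
    return N[0][n], evaluated
```

Calling `max_pairs(seq, sparse=True)` and `max_pairs(seq, sparse=False)` on the same sequence returns the same optimum, while the second component of the result shows how many split points each variant inspected. The fraction of split points saved in this toy model is not comparable to the percentages quoted in the abstract, which refer to mfe-folding.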
- Publication: arXiv e-prints
- Pub Date: December 2011
- DOI: 10.48550/arXiv.1201.0308
- arXiv: arXiv:1201.0308
- Bibcode: 2012arXiv1201.0308H
- Keywords: Mathematics - Combinatorics; 32Q55
- E-Print: 27 pages, 12 figures