Skip to Main content Skip to Navigation
Journal articles

Regular expression constrained sequence alignment revisited

Abstract : Imposing constraints in the form of a finite automaton or a regular expression is an effective way to incorporate additional a priori knowledge into sequence alignment procedures. With this motivation, the Regular Expression Constrained Sequence Alignment Problem was introduced, which proposed an O(n^2t^4) time and O(n^2t^2) space algorithm for solving it, where n is the length of the input strings and t is the number of states in the input non-deterministic automaton. A faster O(n^2t^3) time algorithm for the same problem was subsequently proposed. In this article, we further speed up the algorithms for Regular Language Constrained Sequence Alignment by reducing their worst case time complexity bound to O(n^2t^3/log t). This is done by establishing an optimal bound on the size of Straight-Line Programs solving the maxima computation subproblem of the basic dynamic programming algorithm. We also study another solution based on a Steiner Tree computation. While it does not improve worst case, our simulations show that both approaches are efficient in practice, especially when the input automata are dense.
Complete list of metadatas

Cited literature [26 references]  Display  Hide  Download

https://hal-upec-upem.archives-ouvertes.fr/hal-00790008
Contributor : Gregory Kucherov <>
Submitted on : Tuesday, February 19, 2013 - 11:23:31 AM
Last modification on : Tuesday, December 17, 2019 - 4:16:01 PM
Long-term archiving on: : Monday, May 20, 2013 - 4:00:47 AM

File

CMB-2010-0291-Kucherov_1P.pdf
Publisher files allowed on an open archive

Identifiers

Citation

Gregory Kucherov, Tamar Pinhas, Michal Ziv-Ukelson. Regular expression constrained sequence alignment revisited. Journal of Computational Biology, Mary Ann Liebert, 2011, 18 (5), pp.771-781. ⟨10.1089/cmb.2010.0291⟩. ⟨hal-00790008⟩

Share

Metrics

Record views

652

Files downloads

540