A Subquadratic Sequence Alignment Algorithm for Unrestricted Scoring Matrices - Archive ouverte HAL Accéder directement au contenu
Article Dans Une Revue SIAM Journal on Computing Année : 2003

A Subquadratic Sequence Alignment Algorithm for Unrestricted Scoring Matrices

Résumé

Given two strings of size n over a constant alphabet, the classical algorithm for computing the similarity between two sequences [D. Sankoff and J. B. Kruskal, eds., Time Warps, String Edits, and Macromolecules; Addison-Wesley, Reading, MA, 1983; T. F. Smith and M. S. Waterman, J. Molec. Biol., 147 (1981), pp. 195-197] uses a dynamic programming matrix and compares the two strings in O(n²) time. We address the challenge of computing the similarity of two strings in subquadratic time for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global similarity computations. The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n² / log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(h n² / log n), where h≤1 is the entropy of the text. We also present an algorithm for comparing two run-length encoded strings of length m and n, compressed into m' and n' runs, respectively, in O(m'n+n'm) complexity. This result extends to all distance or similarity scoring schemes that use an additive gap penalty.
Fichier principal
Vignette du fichier
02-CLU-TR.pdf (500.46 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-00619573 , version 1 (20-03-2013)

Identifiants

Citer

Maxime Crochemore, Gad M. Landau, Michal Ziv-Ukelson. A Subquadratic Sequence Alignment Algorithm for Unrestricted Scoring Matrices. SIAM Journal on Computing, 2003, 32 (6), pp.1654-1673. ⟨10.1137/S0097539702402007⟩. ⟨hal-00619573⟩
169 Consultations
407 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More