A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices - Archive ouverte HAL Accéder directement au contenu
Communication Dans Un Congrès Année : 2002

A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices

Résumé

The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n²) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations.The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n²/log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn²/log n) where h ≤ 1 is the entropy of the text.
Fichier principal
Vignette du fichier
RECDOC02-SODA-TR.pdf (409.25 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-00619996 , version 1 (20-03-2013)

Identifiants

  • HAL Id : hal-00619996 , version 1

Citer

Maxime Crochemore, Gad M. Landau, Michal Ziv-Ukelson. A Sub-quadratic Sequence Alignment Algorithm for Unrestricted Cost Matrices. Proceedings of the Thirteen Annual ACM-SIAM Symposium on Discrete Algorithms, 2002, United States. pp.679-688. ⟨hal-00619996⟩
158 Consultations
40 Téléchargements

Partager

Gmail Facebook X LinkedIn More