Boyer-Moore strategy to efficient approximate string matching - Archive ouverte HAL Accéder directement au contenu
Communication Dans Un Congrès Année : 1996

Boyer-Moore strategy to efficient approximate string matching

Résumé

We propose a simple but e cient algorithm for searching all occurrences of a pattern or a class of patterns (length m) in a text (length n) with at most k mismatches. This algorithm relies on the Shift-Add algorithm of Baeza-Yates and Gonnet [6], which involves representing by a bit number the current state of the search and uses the ability of programming languages to handle bit words. State representation should not, therefore, exceeds the word size w, that is, m(⌈log2(k+1)⌉+1 )≤w. This algorithm consists in a preprocessing step and a searching step. It is linear and performs 3n operations during the searching step. Notions of shift and character skip found in the Boyer-Moore (BM) [9] approach, are introduced in this algorithm. Provided that the considered alphabet is large enough (compared to the Pattern length), the average number of operations performed by our algorithm during the searching step becomes n(2+(k+4)/(m-k)).
Fichier principal
Vignette du fichier
cpm96.pdf (319.82 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-00620020 , version 1 (26-03-2013)

Identifiants

  • HAL Id : hal-00620020 , version 1

Citer

Nadia El Mabrouk, Maxime Crochemore. Boyer-Moore strategy to efficient approximate string matching. Combinatorial Pattern Matching (Labuna Beach, California, 1996), 1996, France. pp.24-38. ⟨hal-00620020⟩
97 Consultations
790 Téléchargements

Partager

Gmail Facebook X LinkedIn More