Conference paper. Year: 2011

MWU-aware Part-of-Speech Tagging with a CRF model and lexical resources

Mathieu Constant
Anthony Sigogne
  • Role: Author
  • PersonId: 764797
  • IdRef: 167754998

Abstract

This paper describes a new part-of-speech tagger that also identifies multiword units (MWUs). It is based on a Conditional Random Field (CRF) model integrating language-independent features as well as features computed from external lexical resources. It is implemented in a finite-state framework consisting of a preliminary finite-state lexical analysis followed by CRF decoding through weighted finite-state transducer composition. We show that our tagger reaches state-of-the-art results for French under the standard evaluation conditions (i.e. each multiword unit is already merged into a single token). The evaluation of the tagger with integrated MWU recognition clearly shows the benefit of incorporating features based on MWU resources.
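As an illustration of how MWU identification can be folded into part-of-speech tagging, the sketch below casts both tasks as a single sequence-labeling problem: each word receives a label made of a POS tag plus a position marker (B for the first word of a unit, I for the following ones), and each word also gets a feature computed from an MWU lexicon. The abstract does not specify the labeling scheme or the feature set, so the scheme, the toy lexicon `MWU_LEXICON`, the tag names, and the helper names `encode_labels` and `lexicon_features` are all assumptions made for this example, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical toy MWU lexicon (an assumption for illustration only):
# maps a lowercased word sequence to the POS tag of the whole unit.
MWU_LEXICON: Dict[Tuple[str, ...], str] = {
    ("pomme", "de", "terre"): "NC",
    ("a", "priori"): "ADV",
}


def encode_labels(units: List[Tuple[List[str], str]]) -> List[Tuple[str, str]]:
    """Flatten (words, pos) units into per-word labels of the form POS+B / POS+I."""
    labeled = []
    for words, pos in units:
        for i, word in enumerate(words):
            position = "B" if i == 0 else "I"
            labeled.append((word, f"{pos}+{position}"))
    return labeled


def lexicon_features(words: List[str], max_len: int = 4) -> List[Dict[str, str]]:
    """Attach to each word a feature saying whether a lexicon MWU covers it.

    At each start position, the longest lexicon entry is preferred; words
    already covered by an earlier match keep their first value.
    """
    lowered = [w.lower() for w in words]
    feats = [{"word": w, "mwu": "O"} for w in lowered]
    for start in range(len(words)):
        for length in range(max_len, 1, -1):
            key = tuple(lowered[start:start + length])
            if len(key) == length and key in MWU_LEXICON:
                tag = MWU_LEXICON[key]
                for offset in range(length):
                    if feats[start + offset]["mwu"] == "O":
                        position = "B" if offset == 0 else "I"
                        feats[start + offset]["mwu"] = f"{tag}+{position}"
                break
    return feats
```

With this encoding, a gold annotation such as `[(["une"], "DET"), (["pomme", "de", "terre"], "NC")]` becomes four word-level labels, and the `mwu` feature of each word can be passed to any CRF toolkit alongside the usual word-level features.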
Main file
constant_sigogne.pdf (93.63 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00621585, version 1 (11-09-2013)

Identifiers

  • HAL Id: hal-00621585, version 1

Cite

Mathieu Constant, Anthony Sigogne. MWU-aware Part-of-Speech Tagging with a CRF model and lexical resources. ACL Workshop on Multiword Expressions: from Parsing and Generation to the Real World (MWE'11), 2011, Portland, Oregon, United States. pp.49-56. ⟨hal-00621585⟩
