Conference poster, Year: 2014

Construction of linguistic resources for the extraction of "complex text segments"

Abstract

The core of our work is the development of computational linguistic resources (electronic dictionaries and grammars) for the automatic extraction, identification, and fine-grained annotation of "complex text segments". We use and extend the notion of multi-word units (MWUs) to cover a broad range of linguistic objects: compound nouns, entity names, verbal forms (compound tenses and negated forms, insertion of clauses between the auxiliary and the past participle, etc.) and frozen expressions (i.e. idioms). Complex sequences of text segments are identified using dictionary graphs, which combine the power and versatility of local grammars with the expressivity of electronic dictionaries.
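
As a rough illustration of how a dictionary lookup and a local grammar can work together, the sketch below recognizes one of the verbal forms mentioned in the abstract: a compound tense with negation or adverbs inserted between the auxiliary and the past participle. It is a minimal sketch and not the authors' resources or tooling. The dictionary entries, the grammatical codes (`AUX`, `NEG`, `ADV`, `PP`), the linear grammar format, and the function names are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Toy electronic dictionary: inflected form -> set of grammatical codes.
# Entries and codes are hypothetical, chosen only for this illustration.
DICTIONARY: Dict[str, Set[str]] = {
    "has": {"AUX"},
    "have": {"AUX"},
    "had": {"AUX"},
    "not": {"NEG"},
    "yet": {"ADV"},
    "already": {"ADV"},
    "finished": {"PP"},
    "written": {"PP"},
}

# Toy local grammar: a linear path of (accepted codes, min, max) steps.
# An auxiliary, then up to two inserted NEG/ADV tokens, then a past participle.
Step = Tuple[Set[str], int, int]
COMPOUND_TENSE: List[Step] = [
    ({"AUX"}, 1, 1),
    ({"NEG", "ADV"}, 0, 2),
    ({"PP"}, 1, 1),
]


def codes(token: str) -> Set[str]:
    """Look up the grammatical codes of a token (empty set if unknown)."""
    return DICTIONARY.get(token.lower(), set())


def match_at(tokens: List[str], start: int, grammar: List[Step]) -> Optional[int]:
    """Greedily match the grammar from `start`; return the end index or None."""
    pos = start
    for accepted, lo, hi in grammar:
        count = 0
        while count < hi and pos < len(tokens) and codes(tokens[pos]) & accepted:
            pos += 1
            count += 1
        if count < lo:
            return None
    return pos


def extract_segments(text: str, grammar: List[Step] = COMPOUND_TENSE) -> List[str]:
    """Return every non-overlapping token sequence accepted by the grammar."""
    tokens = text.split()  # naive whitespace tokenization, no punctuation handling
    segments: List[str] = []
    i = 0
    while i < len(tokens):
        end = match_at(tokens, i, grammar)
        if end is not None:
            segments.append(" ".join(tokens[i:end]))
            i = end
        else:
            i += 1
    return segments


print(extract_segments("She has not yet finished the report"))
```

The grammar path refers to dictionary codes rather than to literal words, so adding a new participle or adverb to the dictionary extends coverage without touching the grammar. A real dictionary graph would be a full automaton with branching and nested subgraphs rather than a single linear path.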
No file deposited

Dates and versions

hal-01448712, version 1 (28-01-2017)

Identifiers

  • HAL Id: hal-01448712, version 1

Cite

Tita Kyriacopoulou, Claude Martineau, Cristian Martinez, Aggeliki Fotopoulou. Construction of linguistic resources for the extraction of "complex text segments". PARSEME 2nd general meeting, Mar 2014, Athens, Greece. ⟨hal-01448712⟩