present the VNC Tokens dataset, containing almost 3,000 occurrences of 53 Verb+Noun combinations in direct object relation, annotated as literal or idiomatic. In all, only 18% of all combinations were annotated as literal, which is roughly consistent with our study. Hashimoto and Kawahara (2008) offer a Japanese counterpart of these resources, with 146 idioms and over 102,000 example sentences. Sentences were automatically preselected from a corpus if they contained occurrences of the components of a reference MWE and if the dependencies between those components were "canonical". This probably means that syntactic variability in LOs is underrepresented in this dataset. The authors mention that "some idioms are short of examples" [...] 162 sentences from the British National Corpus in which verb-object pairs formed with do, get, give, have, make, and take are marked as positive and negative examples of LVCs.
Multiword Expressions, Handbook of Natural Language Processing, pp.267-292, 2010.
GhoSt-PV: A Representative Gold Standard of German Particle Verbs, Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon, 2016.
Literal analysis and idiom retrieval in ambiguous idioms processing: A reading-time study, Journal of Cognitive Psychology, vol.27, issue.7, pp.797-811, 2015.
Multiword Expression Processing: A Survey, Computational Linguistics, 2017.
URL : https://hal.archives-ouvertes.fr/halshs-01665254
The VNC-Tokens Dataset, Proceedings of the Workshop on Multiword Expressions, 2008.
Unsupervised Compositionality Prediction of Nominal Compounds, Computational Linguistics, 2019.
WALS Online, Max Planck Institute for Evolutionary Anthropology, 2013.
Statistical Measures for Characterising MWEs, IC1207 COST PARSEME 5th general meeting, 2015.
Unsupervised Type and Token Identification of Idiomatic Expressions, Computational Linguistics, vol.35, issue.1, pp.61-103, 2009.
"Spilling the bag" on idiomatic variation, Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, pp.1-33, 2018.
Studies in the Way of Words, 1989.
Construction of an Idiom Corpus and its Application to Idiom Identification based on WSD Incorporating Idiom-Specific Features, Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp.992-1001, 2008.
Verbal Multiword Expressions in Basque Corpora, Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pp.86-95, 2018.
Automatic Identification of Non-Compositional MultiWord Expressions using Latent Semantic Analysis, Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, pp.12-19, 2006.
DOI : 10.3115/1613692.1613696
URL : http://dl.acm.org/ft_gateway.cfm?id=1613696&type=pdf
Distinguishing Literal and Non-Literal Usage of German Particle Verbs, Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.353-362, 2016.
Same syntax, different semantics: A compositional approach to idiomaticity in multi-word expressions, Syntax and Semantics 11, pp.111-140, 2016.
Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, pp.87-147, 2018.
Universal Dependencies v1: A Multilingual Treebank Collection, Proceedings of the Tenth International Conference on Language Resources and Evaluation, LREC 2016, pp.1659-1666, 2016.
From Lexical Functional Grammar to Enhanced Universal Dependencies: Linguistically informed treebanks of Polish, Institute of Computer Science, Polish Academy of Sciences, 2018.
Structure lexico-syntaxique des locutions du français et incidence sur leur combinatoire, 2017.
Automatic Idiom Recognition with Word Embeddings, SIMBig (Revised Selected Papers), vol.656, pp.17-29, 2016.
DOI : 10.1007/978-3-319-55209-5_2
Classifying Idiomatic and Literal Expressions Using Topic Models and Intensity of Emotions, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.2019-2027, 2014.
DOI : 10.3115/v1/d14-1216
URL : https://doi.org/10.3115/v1/d14-1216
The figurative and literal senses of idioms, or all idioms are not used equally, Journal of Psycholinguistic Research, vol.17, issue.6, pp.475-487, 1988.
DOI : 10.1007/bf01067912
Phraseology in two Slavic Valency Dictionaries: Limitations and Perspectives, International Journal of Lexicography, vol.30, issue.1, pp.1-38, 2017.
How Naked is the Naked Truth? A Multilingual Lexicon of Nominal Compound Compositionality, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol.2, pp.156-161, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01459911
Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions, Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pp.222-240, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01865575
The alleged priority of literal interpretation, Cognitive Science, vol.19, pp.207-232, 1995.
URL : https://hal.archives-ouvertes.fr/ijn_00000181
Literal readings of multiword expressions: as scarce as hen's teeth, Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories (TLT 16), pp.64-72, 2018.
The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions, Proceedings of the EACL'17 Workshop on Multiword Expressions, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01865575
Federico Sangati, Ivelina Stoyanova, and Veronika Vincze. PARSEME multilingual corpus of verbal multiword expressions, Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, pp.87-147, 2018.
Verbal MWEs: Idiomaticity and flexibility, Representation and Parsing of Multiword Expressions, pp.5-38, 2019.
Learning English Light Verb Constructions: Contextual or Statistical, Proceedings of the Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE '11, pp.31-39, 2011.
Sorting out the Most Confusing English Phrasal Verbs, Proceedings of the 6th International Workshop on Semantic Evaluation, SemEval '12, vol.1, pp.65-69, 2012.
Promoting multiword expressions in A* TAG parsing, COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, pp.429-439, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01378903