Cook et al. (2008) present the VNC-Tokens dataset, containing almost 3,000 occurrences of 53 Verb+Noun combinations in direct object relation, annotated as literal or idiomatic. In all, only 18% of all combinations were annotated as literal, which is roughly consistent with our study. Hashimoto and Kawahara (2008) offer a Japanese counterpart of these resources, with 146 idioms and over 102,000 example sentences. Sentences were automatically preselected in a corpus if they contained occurrences of the components of a reference MWE, and if the dependencies between those components were "canonical". This probably means that syntactic variability in LOs is underrepresented in this dataset. The authors mention that "some idioms are short of examples" [...] 162 sentences from the British National Corpus in which verb-object pairs formed with do, get, give, have, make, and take are marked as positive and negative examples of LVCs.

T. Baldwin and S. N. Kim, Multiword Expressions, Handbook of Natural Language Processing, pp.267-292, 2010.

S. Bott, N. Khvtisavrishvili, M. Kisselew, and S. Schulte im Walde, GhoSt-PV: A Representative Gold Standard of German Particle Verbs, Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon, 2016.

C. Cacciari and P. Corradini, Literal analysis and idiom retrieval in ambiguous idioms processing: A reading-time study, Journal of Cognitive Psychology, vol.27, issue.7, pp.797-811, 2015.

M. Constant, G. Eryiğit, J. Monti, L. van der Plas et al., Multiword Expression Processing: A Survey, Computational Linguistics, 2017.
URL : https://hal.archives-ouvertes.fr/halshs-01665254

P. Cook, A. Fazly, and S. Stevenson, The VNC-Tokens Dataset, Proceedings of the Workshop on Multiword Expressions, 2008.

S. Cordeiro, A. Villavicencio, M. Idiart, and C. Ramisch, Unsupervised Compositionality Prediction of Nominal Compounds, Computational Linguistics, 2019.

M. S. Dryer and M. Haspelmath, WALS Online. Max Planck Institute for Evolutionary Anthropology, 2013.

I. El Maarouf and M. Oakes, Statistical Measures for Characterising MWEs, IC1207 COST PARSEME 5th general meeting, 2015.

A. Fazly, P. Cook, and S. Stevenson, Unsupervised Type and Token Identification of Idiomatic Expressions, Computational Linguistics, vol.35, issue.1, pp.61-103, 2009.

K. Geeraert, R. H. Baayen, and J. Newman, "Spilling the bag" on idiomatic variation, Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, pp.1-33, 2018.

H. P. Grice, Studies in the Way of Words, 1989.

C. Hashimoto and D. Kawahara, Construction of an Idiom Corpus and its Application to Idiom Identification based on WSD Incorporating Idiom-Specific Features, Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp.992-1001, 2008.

U. Iñurrieta, I. Aduriz, A. Estarrona, I. Gonzalez-Dios, A. Gurrutxaga et al., Verbal Multiword Expressions in Basque Corpora, Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pp.86-95, 2018.

G. Katz and E. Giesbrecht, Automatic Identification of Non-Compositional MultiWord Expressions using Latent Semantic Analysis, Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties, pp.12-19, 2006.
DOI : 10.3115/1613692.1613696
URL : http://dl.acm.org/ft_gateway.cfm?id=1613696&type=pdf

M. Köper and S. Schulte im Walde, Distinguishing Literal and Non-Literal Usage of German Particle Verbs, Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp.353-362, 2016.

T. Lichte and L. Kallmeyer, Same syntax, different semantics: A compositional approach to idiomaticity in multi-word expressions, Syntax and Semantics 11, pp.111-140, 2016.

S. Markantonatou, C. Ramisch, A. Savary, and V. Vincze, Preface, Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, pp.87-147, 2018.

J. Nivre, M.-C. de Marneffe, F. Ginter, Y. Goldberg, J. Hajič et al., Universal Dependencies v1: A Multilingual Treebank Collection, in N. Calzolari, K. Choukri, T. Declerck, S. Goggi et al. (eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pp.1659-1666, 2016.

A. Patejuk and A. Przepiórkowski, From Lexical Functional Grammar to Enhanced Universal Dependencies: Linguistically informed treebanks of Polish. Institute of Computer Science, Polish Academy of Sciences, 2018.

M. Pausé, Structure lexico-syntaxique des locutions du français et incidence sur leur combinatoire, 2017.

J. Peng and A. Feldman, Automatic Idiom Recognition with Word Embeddings, SIMBig (Revised Selected Papers), vol.656, pp.17-29, 2016.
DOI : 10.1007/978-3-319-55209-5_2

J. Peng, A. Feldman, and E. Vylomova, Classifying Idiomatic and Literal Expressions Using Topic Models and Intensity of Emotions, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp.2019-2027, 2014.
DOI : 10.3115/v1/d14-1216
URL : https://doi.org/10.3115/v1/d14-1216

S. J. Popiel and K. McRae, The figurative and literal senses of idioms, or all idioms are not used equally, Journal of Psycholinguistic Research, vol.17, issue.6, pp.475-487, 1988.
DOI : 10.1007/bf01067912

A. Przepiórkowski, J. Hajič, E. Hajnicz, and Z. Urešová, Phraseology in two Slavic Valency Dictionaries: Limitations and Perspectives, International Journal of Lexicography, vol.30, issue.1, pp.1-38, 2017.

C. Ramisch, S. Cordeiro, L. Zilio, M. Idiart, A. Villavicencio et al., How Naked is the Naked Truth? A Multilingual Lexicon of Nominal Compound Compositionality, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, vol.2, pp.156-161, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01459911

C. Ramisch, S. R. Cordeiro, A. Savary, V. Vincze, V. Barbu Mititelu et al., Edition 1.1 of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions, Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pp.222-240, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01865575

F. Recanati, The alleged priority of literal interpretation, Cognitive Science, vol.19, pp.207-232, 1995.
URL : https://hal.archives-ouvertes.fr/ijn_00000181

A. Savary and S. Cordeiro, Literal readings of multiword expressions: as scarce as hen's teeth, Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories (TLT 16), pp.64-72, 2018.

A. Savary, C. Ramisch, S. Cordeiro, F. Sangati, V. Vincze et al., The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions, Proceedings of the EACL'17 Workshop on Multiword Expressions, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01865575

A. Savary, M. Candito, V. Barbu Mititelu, E. Bejček, F. Cap et al., PARSEME multilingual corpus of verbal multiword expressions, Multiword expressions at length and in depth: Extended papers from the MWE 2017 workshop, pp.87-147, 2018.

L. Herzig Sheinfux, T. Arad Greshler, N. Melnik, and S. Wintner, Verbal MWEs: Idiomaticity and flexibility, Representation and Parsing of Multiword Expressions, pp.5-38, 2019.

Y. Tu and D. Roth, Learning English Light Verb Constructions: Contextual or Statistical, Proceedings of the Workshop on Multiword Expressions: From Parsing and Generation to the Real World, MWE '11, pp.31-39, 2011.

Y. Tu and D. Roth, Sorting out the Most Confusing English Phrasal Verbs, Proceedings of the 6th International Workshop on Semantic Evaluation, SemEval '12, vol.1, pp.65-69, 2012.

J. Waszczuk, A. Savary, and Y. Parmentier, Promoting multiword expressions in A* TAG parsing, COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, pp.429-439, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01378903