In this thesis, we present a study of English noun–noun compound analysis that takes a holistic perspective on the problem. The holistic nature of our work manifests in three respects. First, of five compound analysis tasks, we focus primarily on compound interpretation and identification, but we also create a resource for compound bracketing and reflect on (the need for) compound sense disambiguation in our work. Second, we part company with past natural language processing (NLP) studies on compound analysis and resituate the problem within general-purpose whole-sentence meaning representation frameworks. Specifically, we introduce a new approach (and a new resource) that derives the semantic interpretation of noun–noun compounds from linguistic resources that represent the semantics of phrasal or sentential structures (viz. NomBank and PCEDT), in contradistinction to the more isolated, compound-centric perspective of much past work. Third, we empirically determine the utility of distributional semantic models as well as of neural networks to classify the relations holding between the compound constituents. Our experimental setup is systematically varied to account for different properties of the models we use, in isolation and in combination. In addition to investigating several properties of word embeddings and neural networks, we also experiment with transfer and multi-task learning for compound interpretation, two machine learning techniques that have recently drawn much attention in NLP research. Overall, this thesis stands at the intersection of many of the recent, and not-so-recent, developments in NLP.
The paper describes the PARSEME-It corpus, developed within the PARSEME-It project, which aims at the development of methods, tools and resources for multiword expression (MWE) processing for the Italian language. The project is a spin-off of a larger multilingual project for more than 20 languages from several language families, namely the PARSEME COST Action. The first phase of the project was devoted to verbal multiword expressions (VMWEs). They are a particularly interesting lexical phenomenon because of frequent discontinuity and long-distance dependency. Besides, they are very challenging for deep parsing and other Natural Language Processing (NLP) tasks. Notably, MWEs are pervasive in natural languages but are particularly difficult for NLP tools to handle because of their characteristics and idiomaticity. They pose many challenges to their correct identification and processing: they are a linguistic phenomenon on the edge between lexicon and grammar, their meaning is not simply the sum of the meanings of their constituents, and they are ambiguous, since in several cases their reading can be literal or idiomatic. Although several studies have been devoted to this topic, to the best of our knowledge, our study is the first attempt to provide a general framework for the identification of VMWEs in running texts and a comprehensive corpus for the Italian language.