Composition Models for the Representation and Semantic Interpretation of Nominal Compounds

dc.contributor.advisor Hinrichs, Erhard (Prof. Dr.)
dc.contributor.author Dima, Gina-Corina
dc.date.accessioned 2019-03-19T08:17:39Z
dc.date.available 2019-03-19T08:17:39Z
dc.date.issued 2019-03-19
dc.identifier.other 1662385889 de_DE
dc.identifier.uri http://hdl.handle.net/10900/87098
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-870984 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-28485
dc.description.abstract The central topic of this thesis is composition models of distributional semantics and their application to representing the semantics of German and English nominal compounds. Composition models are mathematical transformations that, given a compound like Apfelbaum ‘apple tree’, can be applied to the vector representations of Apfel ‘apple’ and Baum ‘tree’ to obtain a vector representation for the compound Apfelbaum ‘apple tree’. The new composed representation is deemed appropriate if it is similar to the representation of Apfelbaum that can be directly learned from large corpora using distributional methods. The thesis is structured into eight chapters. The first four chapters introduce compounds from a linguistic perspective (Chapter 1), present a review of annotation schemes for nominal compounds and introduce a new hybrid annotation scheme (Chapter 2), introduce neural networks and how to represent words via numerical features (Chapter 3) and detail the distributional representation of words (Chapter 4). Existing composition models of distributional semantics are reviewed and evaluated in Chapter 5. Chapter 5 also introduces three new composition models, addmask, wmask and multimatrix, that aim to improve over existing composition models either through a more efficient parametrization (*mask) or by promoting parameter reuse across different but semantically similar words (multimatrix). The results show that composition models are able to construct meaningful composed representations for 81.8% of the German test compounds and 78.03% of the English test compounds. In Chapter 6 composed representations are shown to be a useful indicator when investigating non-compositional (lexicalized) compounds. For example, when modeling a compound like Tigerauge ‘tiger eye’, composition models will produce a composed representation that corresponds to the literal interpretation of the compound: the eye of a tiger. This vector, however, is dissimilar to the distributional vector learned directly from the corpus, which captures the lexicalized meaning of a semi-precious stone. In Chapter 7 composed representations prove to be the best features for classifying compounds in terms of their semantic relations in setups where simplex words and compounds have representations of the same length. Further analyses also show that some of the modifier information is discarded during the composition process and that extrinsic evaluation tasks such as the semantic classification task are necessary for assessing and improving the quality of the composed representations. Chapter 8 concludes by emphasizing the main contributions of the thesis and sketching directions for future contributions. en
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.classification Computational linguistics, Machine learning, Compound, Semantics de_DE
dc.subject.ddc 004 de_DE
dc.subject.ddc 400 de_DE
dc.subject.ddc 420 de_DE
dc.subject.ddc 430 de_DE
dc.subject.other Computational Linguistics en
dc.subject.other Word Representations en
dc.subject.other Composition Models en
dc.subject.other Distributional Semantics en
dc.subject.other Machine Learning en
dc.subject.other Neural Networks en
dc.subject.other Nominal Compounds en
dc.subject.other Semantic Relations en
dc.title Composition Models for the Representation and Semantic Interpretation of Nominal Compounds en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2019-02-07
utue.publikation.fachbereich Allgemeine u. vergleichende Sprachwissenschaft de_DE
utue.publikation.fakultaet 5 Philosophische Fakultät de_DE
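
The abstract describes a composition model as a transformation applied to the vectors of a compound's constituents, judged by how similar the composed vector is to the one learned directly from a corpus. The following is a minimal sketch of that idea, not the thesis's method: it uses plain weighted addition (a simple baseline, not addmask, wmask or multimatrix), cosine similarity as the similarity measure (an assumption), and invented 4-dimensional toy vectors. The function names and the weights `alpha` and `beta` are hypothetical.

```python
import math


def compose_weighted_add(modifier, head, alpha=0.5, beta=0.5):
    """Weighted additive composition: alpha * modifier + beta * head.

    A simple baseline chosen for illustration; it is not one of the
    models introduced in the thesis.
    """
    if len(modifier) != len(head):
        raise ValueError("constituent vectors must have the same length")
    return [alpha * m + beta * h for m, h in zip(modifier, head)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors invented for this sketch; real distributional vectors
# are learned from large corpora and have many more dimensions.
apfel = [0.9, 0.1, 0.3, 0.0]
baum = [0.1, 0.8, 0.4, 0.1]
apfelbaum_observed = [0.5, 0.5, 0.4, 0.05]  # stand-in for the corpus vector

tiger = [0.8, 0.0, 0.2, 0.1]
auge = [0.2, 0.1, 0.7, 0.0]
tigerauge_observed = [0.0, 0.1, 0.1, 0.9]  # stand-in for the "stone" sense

# Compositional compound: composed and observed vectors point the same way.
sim_apfelbaum = cosine_similarity(
    compose_weighted_add(apfel, baum), apfelbaum_observed
)

# Lexicalized compound: the literal composition drifts away from the
# observed vector, which is the signal Chapter 6 of the thesis exploits.
sim_tigerauge = cosine_similarity(
    compose_weighted_add(tiger, auge), tigerauge_observed
)

print(f"Apfelbaum: {sim_apfelbaum:.3f}")
print(f"Tigerauge: {sim_tigerauge:.3f}")
```

With these toy values the Apfelbaum similarity is high and the Tigerauge similarity is low, mirroring the contrast between compositional and lexicalized compounds drawn in the abstract. The numbers themselves carry no empirical meaning.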