Composition Models for the Representation and Semantic Interpretation of Nominal Compounds



Citable link (URI): http://hdl.handle.net/10900/87098
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-870984
http://dx.doi.org/10.15496/publikation-28485
Document type: Dissertation
Date of publication: 2019-03-19
Language: English
Faculty: 5 Philosophische Fakultät
Department: Allgemeine u. vergleichende Sprachwissenschaft
Advisor: Hinrichs, Erhard (Prof. Dr.)
Date of oral examination: 2019-02-07
DDC classification: 004 - Computer science
400 - Language, linguistics
420 - English
430 - German
Keywords: Computational linguistics, Machine learning, Compound, Semantics
Free keywords:
Computational Linguistics
Word Representations
Composition Models
Distributional Semantics
Machine Learning
Neural Networks
Nominal Compounds
Semantic Relations
License: http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en

Abstract:

The central topic of this thesis is composition models of distributional semantics and their application to representing the semantics of German and English nominal compounds. Composition models are mathematical transformations that, given a compound like Apfelbaum ‘apple tree’, can be applied to the vector representations of Apfel ‘apple’ and Baum ‘tree’ to obtain a vector representation for the compound Apfelbaum. The new composed representation is deemed adequate if it is similar to the representation of Apfelbaum that can be learned directly from large corpora using distributional methods. The thesis is structured into eight chapters. The first four chapters introduce compounds from a linguistic perspective (Chapter 1), review annotation schemes for nominal compounds and introduce a new hybrid annotation scheme (Chapter 2), introduce neural networks and the representation of words via numerical features (Chapter 3), and detail the distributional representation of words (Chapter 4). Existing composition models of distributional semantics are reviewed and evaluated in Chapter 5, which also introduces three new composition models, addmask, wmask, and multimatrix, that aim to improve over existing composition models either through a more efficient parametrization (*mask) or by promoting parameter reuse across different but semantically similar words (multimatrix). The results show that composition models are able to construct meaningful composed representations for 81.8% of the German test compounds and 78.03% of the English test compounds. In Chapter 6, composed representations are shown to be a useful indicator when investigating non-compositional (lexicalized) compounds. For example, when modeling a compound like Tigerauge ‘tiger eye’, composition models produce a composed representation that corresponds to the literal interpretation of the compound: the eye of a tiger. This vector, however, is dissimilar to the distributional vector learned directly from the corpus, which captures the lexicalized meaning, a semi-precious stone. In Chapter 7, composed representations prove to be the best features for classifying compounds by their semantic relations in setups where simplex words and compounds have representations of the same length. Further analyses also show that some of the modifier information is discarded during the composition process and that extrinsic evaluation tasks, such as the semantic classification task, are necessary for assessing and improving the quality of the composed representations. Chapter 8 concludes by emphasizing the main contributions of the thesis and sketching directions for future work.
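The evaluation setup described above can be illustrated with a minimal sketch. The following is not the thesis's actual models or data: the weighted-additive composition function is one simple baseline from the composition-model literature, and the four-dimensional vectors are invented toy values chosen only to show the mechanics of composing two constituent vectors and comparing the result against a corpus-learned compound vector via cosine similarity.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def weighted_addition(u, v, alpha=0.4, beta=0.6):
    """Weighted-additive composition: p = alpha * u + beta * v.

    A simple baseline composition model; alpha and beta are
    illustrative weights, not values from the thesis."""
    return alpha * u + beta * v

# Toy 4-dimensional distributional vectors (invented values).
apfel = np.array([0.9, 0.1, 0.3, 0.0])                # 'apple'
baum = np.array([0.2, 0.8, 0.4, 0.1])                 # 'tree'
apfelbaum_observed = np.array([0.6, 0.5, 0.4, 0.05])  # corpus-learned compound vector

composed = weighted_addition(apfel, baum)

# The composed representation counts as adequate if it is close to the
# representation of the compound learned directly from the corpus.
similarity = cosine(composed, apfelbaum_observed)
print(round(similarity, 3))
```

For a lexicalized compound like Tigerauge, the same comparison would yield a low similarity: the composed vector reflects the literal reading, while the corpus-learned vector reflects the lexicalized gemstone sense.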
