Composition Models for the Representation and Semantic Interpretation of Nominal Compounds

Dima, Gina-Corina

dc.contributor.advisor	Hinrichs, Erhard (Prof. Dr.)
dc.contributor.author	Dima, Gina-Corina
dc.date.accessioned	2019-03-19T08:17:39Z
dc.date.available	2019-03-19T08:17:39Z
dc.date.issued	2019-03-19
dc.identifier.other	1662385889	de_DE
dc.identifier.uri	http://hdl.handle.net/10900/87098
dc.identifier.uri	http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-870984	de_DE
dc.identifier.uri	http://dx.doi.org/10.15496/publikation-28485
dc.identifier.uri	http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-870982	de_DE
dc.identifier.uri	http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-870981	de_DE
dc.description.abstract	The central topic of this thesis is composition models of distributional semantics and their application to representing the semantics of German and English nominal compounds. Composition models are mathematical transformations that, given a compound like Apfelbaum ‘apple tree’, can be applied to the vector representations of Apfel ‘apple’ and Baum ‘tree’ to obtain a vector representation for the compound Apfelbaum ‘apple tree’. The new composed representation is deemed appropriate if it is similar to the representation of Apfelbaum that can be directly learned from large corpora using distributional methods. The thesis is structured into eight chapters. The first four chapters introduce compounds from a linguistic perspective (Chapter 1), present a review of annotation schemes for nominal compounds and introduce a new hybrid annotation scheme (Chapter 2), introduce neural networks and how to represent words via numerical features (Chapter 3) and detail the distributional representation of words (Chapter 4). Existing composition models of distributional semantics are reviewed and evaluated in Chapter 5. Chapter 5 also introduces three new composition models, addmask, wmask and multimatrix, that aim to improve over existing composition models either through a more efficient parametrization (*mask) or by promoting parameter reuse across different, but semantically similar words (multimatrix). The results show that composition models are able to construct meaningful composed representations for 81.8% of the German test compounds and 78.03% of the English test compounds. In Chapter 6 composed representations are shown to be a useful indicator when investigating non-compositional (lexicalized) compounds. For example, when modeling a compound like Tigerauge, ‘tiger eye’, composition models will produce a composed representation that corresponds to the literal interpretation of the compound: the eye of a tiger. This vector, however, is dissimilar to the distributional vector learned directly from the corpus, which captures the lexicalized meaning of a semi-precious stone. In Chapter 7 composed representations prove to be the best features for classifying compounds in terms of their semantic relations in setups where simplex words and compounds have representations of the same length. Further analyses also show that some of the modifier information is discarded during the composition process and that extrinsic evaluation tasks such as the semantic classification task are necessary for assessing and improving the quality of the composed representations. Chapter 8 concludes by emphasizing the main contributions of the thesis and sketching directions for future contributions.	en
dc.language.iso	en	de_DE
dc.publisher	Universität Tübingen	de_DE
dc.rights	ubt-podok	de_DE
dc.rights.uri	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de	de_DE
dc.rights.uri	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en	en
dc.subject.classification	Computerlinguistik , Maschinelles Lernen , Kompositum , Semantik	de_DE
dc.subject.ddc	004	de_DE
dc.subject.ddc	400	de_DE
dc.subject.ddc	420	de_DE
dc.subject.ddc	430	de_DE
dc.subject.other	Computational Linguistics	en
dc.subject.other	Word Representations	en
dc.subject.other	Composition Models	en
dc.subject.other	Distributional Semantics	en
dc.subject.other	Machine Learning	en
dc.subject.other	Neural Networks	en
dc.subject.other	Nominal Compounds	en
dc.subject.other	Semantic Relations	en
dc.title	Composition Models for the Representation and Semantic Interpretation of Nominal Compounds	en
dc.type	PhDThesis	de_DE
dcterms.dateAccepted	2019-02-07
utue.publikation.fachbereich	Allgemeine u. vergleichende Sprachwissenschaft	de_DE
utue.publikation.fakultaet	5 Philosophische Fakultät	de_DE

Files:	dissertation_CorinaDima.pdf 2.45 MB PDF Description: Thesis PDF
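
Illustrative note on the abstract: the idea of composing a compound vector from its constituents and comparing it with the corpus-derived vector can be sketched in a few lines. The sketch below uses a simple weighted additive composition, which is a generic baseline and not one of the thesis's addmask, wmask or multimatrix models (the abstract does not define them). The function names, the weights `alpha` and `beta`, and all vector values are assumptions invented for illustration; real distributional vectors would be learned from a corpus and have far more dimensions.

```python
import math


def compose_additive(modifier, head, alpha=0.5, beta=0.5):
    """Weighted additive composition: alpha * modifier + beta * head.

    Hypothetical baseline for illustration only; alpha and beta would
    normally be tuned or learned rather than fixed at 0.5.
    """
    if len(modifier) != len(head):
        raise ValueError("modifier and head vectors must have the same length")
    return [alpha * m + beta * h for m, h in zip(modifier, head)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy 3-dimensional vectors, invented for this example.
apfel = [0.9, 0.1, 0.3]
baum = [0.2, 0.8, 0.4]
apfelbaum_observed = [0.5, 0.5, 0.4]   # stands in for the corpus-learned vector

tiger = [0.9, 0.1, 0.0]
auge = [0.1, 0.9, 0.0]
tigerauge_observed = [0.0, 0.1, 0.9]   # stands in for the "gemstone" sense

# Compositional case: the composed vector should lie close to the observed one.
apfelbaum_composed = compose_additive(apfel, baum)
sim_apfelbaum = cosine(apfelbaum_composed, apfelbaum_observed)

# Lexicalized case: the literal composition should lie far from the observed one.
tigerauge_composed = compose_additive(tiger, auge)
sim_tigerauge = cosine(tigerauge_composed, tigerauge_observed)

print(f"Apfelbaum similarity: {sim_apfelbaum:.3f}")
print(f"Tigerauge similarity: {sim_tigerauge:.3f}")
```

With these toy values the Apfelbaum similarity is high and the Tigerauge similarity is low, mirroring the abstract's point that a low similarity between the composed and the observed vector can flag a lexicalized compound. The numbers themselves carry no empirical meaning.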
