Query Flattening and the Nested Data Parallelism Paradigm

Ulrich, Alexander

Publikationsdienste
→
TOBIAS-lib - Publikationen und Dissertationen
→
7 Mathematisch-Naturwissenschaftliche Fakultät
→
Dokumentanzeige

dc.contributor.advisor	Grust, Torsten (Prof. Dr.)
dc.contributor.author	Ulrich, Alexander
dc.date.accessioned	2019-04-10T13:31:11Z
dc.date.available	2019-04-10T13:31:11Z
dc.date.issued	2019-04-10
dc.identifier.other	1663144753	de_DE
dc.identifier.uri	http://hdl.handle.net/10900/87698
dc.identifier.uri	http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-876988	de_DE
dc.identifier.uri	http://dx.doi.org/10.15496/publikation-29084
dc.identifier.uri	http://nbn-resolving.org/urn:nbn:de:bsz:21-dspace-876982	de_DE
dc.description.abstract	This work is based on the observation that languages for two seemingly distant domains are closely related. Orthogonal query languages based on comprehension syntax admit various forms of query nesting to construct nested query results and express complex predicates. Languages for nested data parallelism allow to nest parallel iterators and thereby admit the parallel evaluation of computations that are themselves parallel. Both kinds of languages center around the application of side-effect-free functions to each element of a collection. The motivation for this work is the seamless integration of relational database queries with programming languages. In frameworks for language-integrated database queries, a host language's native collection-programming API is used to express queries. To mediate between native collection programming and relational queries, we define an expressive, orthogonal query calculus that supports nesting and order. The challenge of query flattening is to translate this calculus to bundles of efficient relational queries restricted to flat, unordered multisets. Prior approaches to query flattening either support only query languages that lack in expressiveness or employ a complex, monolithic translation that is hard to comprehend and generates inefficient code that is hard to optimize. To improve on those approaches, we draw on the similarity to nested data parallelism. Blelloch's flattening transformation is a static program transformation that translates nested data parallelism to flat data parallel programs over flat arrays. Based on the flattening transformation, we describe a pipeline of small, comprehensible lowering steps that translates our nested query calculus to a bundle of relational queries. The pipeline is based on a number of well-defined intermediate languages. Our translation adopts the key concepts of the flattening transformation but is designed with specifics of relational query processing in mind. Based on this translation, we revisit all aspects of query flattening. Our translation is fully compositional and can translate any term of the input language. Like prior work, the translation by itself produces inefficient code due to compositionality that is not fit for execution without optimization. In contrast to prior work, we show that query optimization is orthogonal to flattening and can be performed before flattening. We employ well-known work on logical query optimization for nested query languages and demonstrate that this body of work integrates well with our approach. Furthermore, we describe an improved encoding of ordered and nested collections in terms of flat, unordered multisets. Our approach emits idiomatic relational queries in which the effort required to maintain the non-relational semantics of the source language (order and nesting) is minimized. A set of experiments provides evidence that our approach to query flattening can handle complex, list-based queries with nested results and nested intermediate data well. We apply our approach to a number of flat and nested benchmark queries and compare their runtime with hand-written SQL queries. In these experiments, our SQL code generated from a list-based nested query language usually performs as well as hand-written queries.	en
dc.language.iso	en	de_DE
dc.publisher	Universität Tübingen	de_DE
dc.rights	ubt-podok	de_DE
dc.rights.uri	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de	de_DE
dc.rights.uri	http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en	en
dc.subject.classification	HASKELL , LINQ , SQL , Abfragesprache , Datenbanksprache , Abfrageverarbeitung , Datenbank , Programmiersprache , Parallelverarbeitung	de_DE
dc.subject.ddc	004	de_DE
dc.subject.other	NESL	en
dc.subject.other	flattening transformation	en
dc.subject.other	query unnesting	en
dc.subject.other	query optimization	en
dc.title	Query Flattening and the Nested Data Parallelism Paradigm	en
dc.type	PhDThesis	de_DE
dcterms.dateAccepted	2018-11-21
utue.publikation.fachbereich	Informatik	de_DE
utue.publikation.fakultaet	7 Mathematisch-Naturwissenschaftliche Fakultät	de_DE

Dateien:	ulrich-phd-publish.pdf 1.15 MB PDF

Das Dokument erscheint in:

7 Mathematisch-Naturwissenschaftliche Fakultät [5309]

Zur Kurzanzeige

Veröffentlichen

Stöbern

Gesamter Bestand
Diese Sammlung

Mein Benutzerkonto

Einloggen

Query Flattening and the Nested Data Parallelism Paradigm

DSpace Repositorium (Manakin basiert)

Das Dokument erscheint in:

Stöbern

Gesamter Bestand

Diese Sammlung

Mein Benutzerkonto