Detached Provenance Analysis

dc.contributor.advisor Grust, Torsten (Prof. Dr.)
dc.contributor.author Müller, Tobias
dc.date.accessioned 2020-03-31T13:25:40Z
dc.date.available 2020-03-31T13:25:40Z
dc.date.issued 2020-03-31
dc.identifier.other 1693721260 de_DE
dc.identifier.uri http://hdl.handle.net/10900/99433
dc.identifier.uri http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-994336 de_DE
dc.identifier.uri http://dx.doi.org/10.15496/publikation-40814
dc.description.abstract Data provenance is the research field of the algorithmic derivation of the source and processing history of data. In this work, the derivation of Where- and Why-provenance in sub-cell-level granularity is pursued for a rich SQL dialect. For example, we support the provenance analysis for individual elements of nested rows and/or arrays. The SQL dialect incorporates window functions and correlated subqueries. We accomplish this goal using a novel method called detached provenance analysis. This method carries out a SQL-level rewrite of any user query Q, yielding (Q1, Q2). Employing two queries facilitates a low-invasive provenance analysis, i.e., both queries can be evaluated using an unmodified DBMS as backend. The queries implement a split of responsibilities: Q1 carries out a runtime analysis and Q2 derives the actual data provenance. One drawback of this method is that a synchronization overhead between Q1 and Q2 is induced. Experiments quantify the overheads based on the TPC-H benchmark and the PostgreSQL DBMS. A second set of experiments carried out in row-level granularity compares our approach with the PERM approach (as described by B. Glavic et al.). The aggregated results show that basic queries (typically, a single SFW expression with aggregations) perform slightly better in the PERM approach while complex queries (nested SFW expressions and correlated subqueries) perform considerably better in our approach. en
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de de_DE
dc.rights.uri http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en en
dc.subject.classification Datenbank, SQL, Datenherkunft de_DE
dc.subject.ddc 004 de_DE
dc.subject.other Data Provenance en
dc.title Detached Provenance Analysis en
dc.type PhDThesis de_DE
dcterms.dateAccepted 2020-03-05
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
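
The split of responsibilities described in the abstract (Q1 performs a runtime analysis, Q2 derives the provenance) can be pictured with a toy sketch. The following is not the thesis's SQL rewrite: it is a minimal Python illustration under the assumption that the first pass logs which rows pass a filter and the second pass replays that log over cell identifiers only. All names (`phase1`, `phase2`, `TABLE`, the log format) are hypothetical, and the query being mimicked is roughly `SELECT name FROM t WHERE price > 10`.

```python
# Toy illustration only. Function names, the log format, and the table are
# hypothetical and are not taken from the thesis.

TABLE = [
    {"name": "pen", "price": 3},
    {"name": "lamp", "price": 25},
    {"name": "desk", "price": 140},
]


def phase1(table, threshold):
    """Value-level pass: compute the ordinary result and log filter decisions."""
    result, log = [], []
    for row_id, row in enumerate(table):
        if row["price"] > threshold:
            result.append(row["name"])
            log.append(row_id)  # remember which row passed the predicate
    return result, log


def phase2(log):
    """Provenance-level pass: reads no values, only replays the logged decisions."""
    provenance = []
    for row_id in log:
        provenance.append({
            # Where-provenance: the cell the output value was copied from.
            "where": {(row_id, "name")},
            # Why-provenance: the cell inspected to decide that the row qualifies.
            "why": {(row_id, "price")},
        })
    return provenance


result, log = phase1(TABLE, 10)
provenance = phase2(log)
```

In this sketch the log is the only link between the two passes, which loosely mirrors the synchronization between Q1 and Q2 that the abstract names as a source of overhead.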
