To err is human? A functional comparison of human and machine decision-making


dc.contributor.advisor Wichmann, Felix A. (Prof. Dr.)
dc.contributor.author Geirhos, Robert
dc.date.accessioned 2022-02-25T11:04:48Z
dc.date.available 2022-02-25T11:04:48Z
dc.date.issued 2022-02-25
dc.identifier.uri de_DE
dc.description.abstract It is hard to imagine what a world without objects would look like. While being able to rapidly recognise objects seems deceptively simple to humans, it has long proven challenging for machines, constituting a major roadblock towards real-world applications. This has changed with recent advances in deep learning: Today, modern deep neural networks (DNNs) often achieve human-level object recognition performance. However, their complexity makes it notoriously hard to understand how they arrive at a decision, which carries the risk that machine learning applications outpace our understanding of machine decisions - without knowing when machines will fail, and why; when machines will be biased, and why; when machines will be successful, and why. Here, we seek to develop a better understanding of machine decision-making by comparing it to human decision-making. Most previous investigations have compared intermediate representations (for example, network activations with neural firing patterns), but ultimately, a machine's behaviour (or output decision) has the most direct relevance: humans are affected by machine decisions, not by "machine thoughts". Therefore, the focus of this thesis and its six constituent projects (P1-P6) is a functional comparison of human and machine decision-making. This is achieved by transferring methods from human psychophysics - a field with a proven track record of illuminating complex visual systems - to modern machine learning. The starting point of our investigations is a simple question: How do DNNs recognise objects, by texture or by shape? Following behavioural experiments with cue-conflict stimuli, we show that the textbook explanation of machine object recognition - an increasingly complex hierarchy based on object parts and shapes - is inaccurate. Instead, standard DNNs simply exploit local image textures (P1).
Intriguingly, this difference between humans and DNNs can be overcome through data augmentation: Training DNNs on a suitable dataset induces a human-like shape bias and leads to emerging human-level distortion robustness in DNNs, enabling them to cope with unseen types of image corruptions much better than any previously tested model. Motivated by the finding that texture bias is pervasive throughout object classification and object detection (P2), we then develop "error consistency", an analysis that quantifies how machine decisions differ from one another depending on, for instance, model architecture or training objective. This analysis reveals remarkable similarities between feedforward and recurrent models (P3), and between supervised and self-supervised models (P4). At the same time, DNNs show little consistency with human observers, reinforcing our finding of fundamentally different decision-making between humans and machines. In the light of these results, we then take a step back, asking where these differences may originate from. We find that many DNN shortcomings can be seen as symptoms of the same underlying pattern: "shortcut learning", a tendency to exploit unintended patterns that fail to generalise to unexpected input (P5). While shortcut learning accounts for many functional differences between human and machine perception, some of them can be overcome: In our last investigation, a large-scale behavioural comparison, toolbox and benchmark (P6), we report partial success in closing the gap between human and machine vision. Taken together, our findings indicate that our understanding of machine decision-making is riddled with (often untested) assumptions.
Putting these on a solid empirical footing, as done here through rigorous quantitative experiments and functional comparisons with human decision-making, is key: for when humans better understand machines, we will be able to build machines that better understand humans - and the world we all share. en
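The error-consistency measure summarised in the abstract is, in the accompanying publications, Cohen's kappa computed on binary correct/incorrect trial outcomes: observed trial-by-trial agreement between two decision-makers is compared against the agreement expected by chance given each one's overall accuracy. A minimal sketch of that idea follows; the function name and interface are illustrative, not the thesis's actual toolbox API:

```python
import numpy as np

def error_consistency(correct_a, correct_b):
    """Cohen's kappa on binary correct/incorrect trial outcomes.

    correct_a, correct_b: sequences of booleans, True on trials where
    the respective observer (human or model) answered correctly.
    """
    a = np.asarray(correct_a, dtype=bool)
    b = np.asarray(correct_b, dtype=bool)
    # Observed consistency: fraction of trials on which both observers
    # are correct together or wrong together.
    c_obs = float(np.mean(a == b))
    # Consistency expected by chance if errors were independent,
    # given each observer's overall accuracy.
    p_a, p_b = float(a.mean()), float(b.mean())
    c_exp = p_a * p_b + (1.0 - p_a) * (1.0 - p_b)
    if c_exp == 1.0:
        return float("nan")  # kappa undefined when chance consistency is 1
    return (c_obs - c_exp) / (1.0 - c_exp)
```

A kappa near 1 means the two observers err on the same trials; a kappa near 0 means their errors are distributed as independently as their accuracies allow, which is the pattern the thesis reports when comparing DNNs with human observers.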
dc.language.iso en de_DE
dc.publisher Universität Tübingen de_DE
dc.rights ubt-podok de_DE
dc.rights.uri de_DE
dc.rights.uri en
dc.subject.ddc 004 de_DE
dc.subject.other machine vision en
dc.subject.other human vision en
dc.subject.other deep learning en
dc.subject.other psychophysics en
dc.subject.other object recognition en
dc.title To err is human? A functional comparison of human and machine decision-making en
dc.type Dissertation de_DE
dcterms.dateAccepted 2022-02-16
utue.publikation.fachbereich Informatik de_DE
utue.publikation.fakultaet 7 Mathematisch-Naturwissenschaftliche Fakultät de_DE
utue.publikation.noppn yes de_DE

