Instance Segmentation and 3D Multi-Object Tracking for Autonomous Driving

Citable link (URI): http://hdl.handle.net/10900/140930
http://nbn-resolving.de/urn:nbn:de:bsz:21-dspace-1409304
http://dx.doi.org/10.15496/publikation-82277
Document type: Dissertation
Publication date: 2023-05-09
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Computer Science
Reviewer: Zell, Andreas (Prof. Dr.)
Date of oral examination: 2023-03-09
DDC classification: 004 - Computer science
Keywords:
Autonomous Driving
3D Multi-Object Tracking
Instance Segmentation
License: http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=de http://tobias-lib.uni-tuebingen.de/doku/lic_mit_pod.php?la=en

Abstract:

Autonomous driving promises to change the way we live. It could save lives, provide mobility, reduce the time wasted on driving, and enable new ways to design our cities. One crucial component of an autonomous driving system is perception: understanding the environment around the car in order to issue proper driving commands. This dissertation focuses on two perception tasks: instance segmentation and 3D multi-object tracking (MOT). In instance segmentation, we discuss different mask representations and propose representing the mask’s boundary as a Fourier series. We show that this implicit representation is compact and fast, and that it yields the highest mAP for a small number of parameters on the MS COCO dataset. During our work on instance segmentation, we also found that the Fourier series is linked to the emerging field of implicit neural representations (INR). We show that the general form of the Fourier series is a Fourier-mapped perceptron with integer frequencies. It follows that a single perceptron is enough to represent any signal, provided the Fourier mapping matrix contains enough frequencies. Furthermore, we use INR to represent masks in instance segmentation and obtain better results than the dominant grid mask representation. In 3D MOT, we focus on tracklet management systems, which we classify into count-based and confidence-based systems. We found that the score update functions previously used for confidence-based systems are not optimal, and we therefore propose score update functions that give better score estimates. In addition, we apply the same technique to the late fusion of object detectors. Finally, we test our algorithm on the nuScenes and Waymo datasets, where it gives a consistent AMOTA boost.
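
To make the idea of a Fourier-series boundary representation concrete, the following is a minimal sketch and not the method of the dissertation itself. It assumes the mask boundary is given as a closed contour of N sampled points, treats each point as a complex number x + iy, and keeps only the coefficients with integer frequencies in [-num_freqs, num_freqs]. The function names `contour_to_fourier` and `fourier_to_contour` are hypothetical and chosen for illustration only.

```python
import numpy as np


def contour_to_fourier(points, num_freqs):
    """Encode a closed contour as 2 * num_freqs + 1 complex Fourier coefficients.

    points: (N, 2) array of boundary samples, ordered along the contour.
    Assumes N > 2 * num_freqs so that every kept frequency exists.
    """
    z = points[:, 0] + 1j * points[:, 1]        # contour as a complex signal
    n = len(z)
    coeffs = np.fft.fft(z) / n                  # normalized DFT coefficients
    freqs = np.fft.fftfreq(n, d=1.0 / n)        # integer frequencies 0, 1, ..., -1
    keep = np.abs(freqs) <= num_freqs           # truncate to low frequencies
    return freqs[keep].astype(int), coeffs[keep]


def fourier_to_contour(freqs, coeffs, num_points):
    """Evaluate the truncated Fourier series at num_points uniform positions."""
    t = np.arange(num_points) / num_points      # curve parameter in [0, 1)
    basis = np.exp(2j * np.pi * freqs[None, :] * t[:, None])
    z = (coeffs[None, :] * basis).sum(axis=1)
    return np.stack([z.real, z.imag], axis=1)
```

For a smooth shape such as an ellipse, three coefficients (frequencies 0, 1 and -1) already reproduce the sampled boundary; more irregular boundaries need more frequencies, which is the trade-off between compactness and accuracy that the abstract refers to.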
