Machine learning models of cell differentiation processes with single-cell transcriptomic measurements


Document type: PhD thesis
Date: 2023-09-29
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Claassen, Manfred (Prof. Dr.)
Day of Oral Examination: 2023-09-25
DDC Classification: 004 - Data processing and computer science


Dynamic biological phenomena, such as the development of immunity following vaccination or the division of a single zygote into the 37 trillion cells of an adult human, are triggered and driven by bio-molecular interactions. The bio-molecular species involved in these interactions are categorised by their molecular properties and physiological function. Experimental protocols typically measure the abundance or characteristics of only a single category of molecular species, and the data they generate are noisy, biased and incomplete. Because of these limitations of measurement technology, computational models cannot represent bio-molecular interactions in full mechanistic detail and have to be restricted to operational definitions of complex biological phenomena. Despite these constraints, computational models tailored to the idiosyncrasies of the data generated by various technologies enable the identification of bio-molecular species and interactions relevant to particular biological processes.
A cell is composed of various bio-molecular species such as nucleic acids, proteins and metabolites. The entire bio-molecular composition of a cell is known as its cell-state. mRNAs are polymeric bio-molecules whose sequence encodes information for the production of proteins. While proteins are ultimately responsible for the execution of cellular functions, mRNA can be measured much more comprehensively, using single-cell RNA sequencing (scRNA-seq) technology. mRNA sequences corresponding to different protein segments are called transcripts, and the relative abundance of the various transcripts indicates the functional properties of the cell. The cell-state can therefore be approximated as a vector of mRNA transcript abundances. The change of the cell-state over the course of a biological process is called differentiation.
This thesis presents three models of cell differentiation and their application to different scRNA-seq experimental protocols and discovery goals.
The first two models are based on the simulation of cell differentiation with Markov chains. The first model provides a generally applicable trajectory inference approach to model differentiation in any biological system with no topological constraints. The second model utilises simulations to model differentiation as a latent state-space process and is used to cluster cells based on transcriptional activity in order to identify transitional cell-states. The third model is based on ordinal logistic regression and is used to identify transcripts whose expression varies along a specified ordinal axis, even in data with other prominent sources of variation.
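To make the idea of simulating differentiation with a Markov chain concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes a toy matrix of random transcript counts, converts each cell to a vector of relative transcript abundances, builds a transition matrix from a k-nearest-neighbour graph with a Gaussian kernel, and simulates one random walk over cells. The function names `transition_matrix` and `simulate_walk`, the kernel, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def transition_matrix(X, k=5):
    """Row-stochastic matrix from a k-nearest-neighbour graph over cells.

    Assumption for illustration: Gaussian kernel on squared Euclidean
    distance, with a per-cell bandwidth set to the median neighbour distance.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances between cell-state vectors.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # exclude self-transitions
    P = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]  # indices of the k closest cells
        scale = np.median(d2[i, nbrs]) + 1e-12
        P[i, nbrs] = np.exp(-d2[i, nbrs] / scale)
    # Normalise each row so that it is a probability distribution.
    return P / P.sum(axis=1, keepdims=True)


def simulate_walk(P, start, n_steps, rng):
    """Simulate one Markov chain trajectory over cell indices."""
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        path.append(int(rng.choice(P.shape[0], p=P[current])))
    return path


# Toy data (hypothetical): 30 cells, 10 transcripts, Poisson counts.
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(30, 10)).astype(float)
# Approximate each cell-state as a vector of relative transcript abundance.
X = counts / counts.sum(axis=1, keepdims=True)

P = transition_matrix(X, k=5)
path = simulate_walk(P, start=0, n_steps=20, rng=rng)
print(path)
```

In a sketch like this, many simulated walks started from a chosen cell could be aggregated to estimate how often each cell is visited. How the thesis defines transition probabilities and uses the simulations is not stated in the abstract.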
