Partial least squares methods for cross-sectional omics data integration

Speaker: Jeanne Jacobine Duistermaat (University of Bologna)

Date: 7 April 2022, 16:00 to 17:00
Venue: Room 22, second floor, Piazza Scaravilli, 2, and online via videoconference on Microsoft Teams

Abstract
Many studies collect multiple omics datasets to gain novel insights into different stages of biological processes and to associate omic features with outcome variables. Several data integration methods have been developed for the joint modelling of omics datasets. We have proposed a probabilistic latent variable modelling framework for inferring the relationship between two omics datasets. The methods in this framework reduce dimensionality and account for heterogeneity among datasets, which arises because the datasets represent different biological processes and are measured with different technologies. The correlation structure is modelled by joint and data-specific components. An extension of the model includes the relationship between the joint components and an outcome variable.

Model parameters are estimated by maximum likelihood. Test statistics are proposed for the null hypothesis of no relationship. We evaluate our methods via simulations. Under the null hypothesis, the test statistics appear to follow an approximately normal distribution. Our methods appear to outperform existing methods for small and heterogeneous datasets in terms of variable selection and prediction accuracy. We illustrate the methods by applying them to multi-omics datasets from a population cohort, cell lines and a case-control study.
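
As a rough illustration of the joint and data-specific structure described in the abstract, the following Python sketch simulates two datasets that share joint latent components and each carry their own specific components. It is a generic two-block latent variable simulation written for this announcement, not the speaker's model or estimation procedure. All function names, parameter names, dimensions and noise levels are assumptions chosen for illustration. The singular values of the sample cross-covariance are used only as a simple descriptive summary of the shared signal, not as the test statistic proposed in the talk.

```python
import numpy as np


def simulate_two_omics(n=200, p=50, q=30, r=2, r_x=1, r_y=1,
                       b=1.0, noise=0.5, seed=0):
    """Simulate two datasets with joint and data-specific latent parts.

    Hypothetical setup (not from the abstract):
      X = T W' + T_x W_x' + E
      Y = U C' + U_y C_y' + F,   with U = b * T + H
    Setting b = 0 removes the relationship between the joint components.
    """
    rng = np.random.default_rng(seed)
    # Orthonormal loadings: first r columns are joint, the rest specific.
    W, _ = np.linalg.qr(rng.standard_normal((p, r + r_x)))
    C, _ = np.linalg.qr(rng.standard_normal((q, r + r_y)))
    W_joint, W_spec = W[:, :r], W[:, r:]
    C_joint, C_spec = C[:, :r], C[:, r:]

    T = rng.standard_normal((n, r))                   # joint scores of X
    U = b * T + 0.3 * rng.standard_normal((n, r))     # joint scores of Y
    T_x = rng.standard_normal((n, r_x))               # X-specific scores
    U_y = rng.standard_normal((n, r_y))               # Y-specific scores

    X = T @ W_joint.T + T_x @ W_spec.T + noise * rng.standard_normal((n, p))
    Y = U @ C_joint.T + U_y @ C_spec.T + noise * rng.standard_normal((n, q))
    return X, Y, T, U


def cross_covariance_singular_values(X, Y):
    """Singular values of the sample cross-covariance between X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S = Xc.T @ Yc / (X.shape[0] - 1)
    return np.linalg.svd(S, compute_uv=False)


if __name__ == "__main__":
    X1, Y1, _, _ = simulate_two_omics(b=1.0, seed=1)  # related datasets
    X0, Y0, _, _ = simulate_two_omics(b=0.0, seed=1)  # no relationship
    print("related:  ", cross_covariance_singular_values(X1, Y1)[:3])
    print("unrelated:", cross_covariance_singular_values(X0, Y0)[:3])
```

With related datasets, the leading singular values should stand out from the rest, whereas with b = 0 only sampling noise remains. The data-specific components add variation within each dataset without contributing to the shared signal.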

Organizer
Christian Hennig

Teams link