Publications
Journal Article
Analyzing postprandial metabolomics data using multiway models: A simulation study
bioRxiv (2023). Status: Submitted
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery, TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion |
Publication Type | Journal Article |
Year of Publication | 2023 |
Journal | bioRxiv |
Publisher | bioRxiv |
URL | https://www.biorxiv.org/content/10.1101/2022.12.19.521154v2 |
DOI | 10.1101/2022.12.19.521154 |
Characterizing human postprandial metabolic response using multiway data analysis
bioRxiv (2023). Status: Submitted
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery, TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion |
Publication Type | Journal Article |
Year of Publication | 2023 |
Journal | bioRxiv |
Publisher | bioRxiv |
DOI | 10.1101/2023.08.31.555521 |
Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions
WIREs Data Mining and Knowledge Discovery 13 (2023). Status: Published
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery, DeCipher |
Publication Type | Journal Article |
Year of Publication | 2023 |
Journal | WIREs Data Mining and Knowledge Discovery |
Volume | 13 |
Number | e1494 |
Publisher | Wiley |
DOI | 10.1002/widm.1494 |
Talks, contributed
Analyzing postprandial metabolomics data using multiway models: A simulation study
In Norwegian Bioinformatics Days, Sundvolden, Norway, 2022. Status: Published
Affiliation | Machine Learning |
Project(s) | TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion, Department of Data Science and Knowledge Discovery |
Publication Type | Talks, contributed |
Year of Publication | 2022 |
Location of Talk | Norwegian Bioinformatics Days, Sundvolden, Norway |
Analyzing postprandial metabolomics data using multiway models: A simulation study
In Nordic Metabolomics 2022, Copenhagen, Denmark, 2022. Status: Published
Affiliation | Machine Learning |
Project(s) | TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion, Department of Data Science and Knowledge Discovery |
Publication Type | Talks, contributed |
Year of Publication | 2022 |
Location of Talk | Nordic Metabolomics 2022, Copenhagen, Denmark |
Analyzing postprandial metabolomics data using multiway models: A simulation study
In NuGOweek 2022 in Spain, 2022. Status: Published
Affiliation | Machine Learning |
Project(s) | TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion, Department of Data Science and Knowledge Discovery |
Publication Type | Talks, contributed |
Year of Publication | 2022 |
Location of Talk | NuGOweek 2022 in Spain |
Characterizing postprandial metabolomics response using multi-way data analysis
In Annual NORBIS Conference, 2022. Status: Published
Analysis of time-resolved postprandial metabolomics data can enhance our knowledge about the human metabolism by providing a better understanding of regulation of subgroups of metabolites (e.g., lipids) and variations in postprandial responses of subgroups of people, with the potential to ultimately advance precision medicine. However, characterizing postprandial metabolomics response and understanding group differences is a challenging task since it requires the analysis of large-scale metabolomics data from a large set of individuals containing measurements of a wide set of metabolites at multiple time points. Such data is in the form of a three-way array: subjects by metabolites by time points. The state-of-the-art analysis methods mainly focus on clustering temporal profiles relying on summaries of the data across subjects or univariate analysis techniques studying one metabolite at a time, and fail to associate subgroups of subjects and subsets of metabolites with the dynamic time profile simultaneously.
In this study, we use NMR (nuclear magnetic resonance) spectroscopy measurements of plasma samples from over three hundred individuals in the COPSAC2000 cohort, collected at multiple time points during a challenge test. We use a multi-way analysis technique called the CANDECOMP/PARAFAC (CP) model to extract interpretable patterns from the time-resolved data. We compare the analysis of postprandial data, fasting-state-corrected data, and fasting-state data alone, and demonstrate how the results of these analysis approaches differ.
Our results show that the CP model reveals biologically meaningful patterns capturing how certain metabolite groups and their temporal profiles relate to various meta variables, in particular, BMI (body mass index), confirming already known biological knowledge as well as revealing new biological insights.
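As an illustration of the kind of model the abstract refers to, the following is a minimal sketch of fitting a CP model to a three-way array (subjects by metabolites by time points) with plain alternating least squares in NumPy. It is not the authors' implementation: the function names `khatri_rao` and `cp_als`, the rank, the array sizes, and the synthetic data are all assumptions made for this example, and practical analyses typically add preprocessing, constraints, multiple random starts, and convergence checks.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def cp_als(x, rank, n_iter=200, seed=0):
    """Fit a CP model to a three-way array with unconstrained ALS.

    Returns one factor matrix per mode (hypothetical helper, illustration only).
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            # Unfold x so rows index `mode` and columns index the other two modes.
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            kr = khatri_rao(first, second)
            # Gram matrix of the Khatri-Rao product via the Hadamard identity.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfolded @ kr @ np.linalg.pinv(gram)
    return factors


if __name__ == "__main__":
    # Synthetic rank-2 data: 30 "subjects", 8 "metabolites", 5 "time points".
    rng = np.random.default_rng(1)
    truth = [rng.standard_normal((dim, 2)) for dim in (30, 8, 5)]
    data = np.einsum("ir,jr,kr->ijk", *truth)
    subjects, metabolites, time_profiles = cp_als(data, rank=2)
    fitted = np.einsum("ir,jr,kr->ijk", subjects, metabolites, time_profiles)
    print("relative error:", np.linalg.norm(data - fitted) / np.linalg.norm(data))
```

Each component pairs one column from each factor matrix, so a subject pattern, a metabolite pattern, and a temporal profile are read together, which is what allows subgroups of subjects and subsets of metabolites to be related to a time profile simultaneously.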
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery, TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion |
Publication Type | Talks, contributed |
Year of Publication | 2022 |
Location of Talk | Annual NORBIS Conference |
Poster
Characterizing postprandial metabolic response using multi-way data analysis
Norwegian Bioinformatics Days, 2022. Status: Published
Analysis of time-resolved postprandial metabolomics data can enhance our knowledge about the human metabolism by providing a better understanding of regulation of subgroups of metabolites (e.g., lipids) and variations in postprandial responses of subgroups of people, with the potential to ultimately advance precision medicine. However, characterizing postprandial metabolomics response and understanding group differences is a challenging task since it requires the analysis of large-scale metabolomics data from a large set of individuals containing measurements of a wide set of metabolites at multiple time points. Such data is in the form of a three-way array: subjects by metabolites by time points. The state-of-the-art analysis methods mainly focus on clustering temporal profiles relying on summaries of the data across subjects or univariate analysis techniques studying one metabolite at a time, and fail to associate subgroups of subjects and subsets of metabolites with the dynamic time profile simultaneously.
In this study, we use NMR (nuclear magnetic resonance) spectroscopy measurements of plasma samples from over three hundred individuals in the COPSAC2000 cohort, collected at multiple time points during a challenge test. We use a multi-way analysis technique called the CANDECOMP/PARAFAC (CP) model to extract interpretable patterns from the time-resolved data. We compare the analysis of postprandial data, fasting-state-corrected data, and fasting-state data alone, and demonstrate how the results of these analysis approaches differ.
Our results show that the CP model reveals biologically meaningful patterns capturing how certain metabolite groups and their temporal profiles relate to various meta variables, in particular, BMI (body mass index), confirming already known biological knowledge as well as revealing new biological insights.
Affiliation | Machine Learning |
Project(s) | TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion, Department of Data Science and Knowledge Discovery |
Publication Type | Poster |
Year of Publication | 2022 |
Place Published | Norwegian Bioinformatics Days |
Keywords | CANDECOMP/PARAFAC, Dynamic metabolomics data, large-scale dataset, Tensor factorization |
Revealing dynamic changes in metabolism through the analysis of postprandial metabolomics data: A simulation study
Metabolomics 2022, Valencia, Spain, 2022. Status: Published
Affiliation | Machine Learning |
Project(s) | TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion, Department of Data Science and Knowledge Discovery |
Publication Type | Poster |
Year of Publication | 2022 |
Place Published | Metabolomics 2022, Valencia, Spain |
Journal Article
Exploring Dynamic Metabolomics Data With Multiway Data Analysis: a Simulation Study
BMC Bioinformatics 23 (2022). Status: Published
Background: Analysis of dynamic metabolomics data holds the promise to improve our understanding of underlying mechanisms in metabolism. For example, it may detect changes in metabolism due to the onset of a disease. Dynamic or time-resolved metabolomics data can be arranged as a three-way array with entries organized according to a subjects mode, a metabolites mode and a time mode. While such time-evolving multiway data sets are increasingly collected, revealing the underlying mechanisms and their dynamics from such data remains challenging. For such data, one of the complexities is the presence of a superposition of several sources of variation: induced variation (due to experimental conditions or inborn errors), individual variation, and measurement error. Multiway data analysis (also known as tensor factorizations) has been successfully used in data mining to find the underlying patterns in multiway data. In this paper, we study the use of multiway data analysis to reveal the underlying patterns and dynamics in time-resolved metabolomics data.
Results: We focus on simulated data arising from different dynamic models of increasing complexity, i.e., a simple linear system, a yeast glycolysis model, and a human cholesterol model. We generate data with induced variation as well as individual variation. Systematic experiments are performed to demonstrate the advantages and limitations of multiway data analysis in analyzing such dynamic metabolomics data and its capacity to disentangle the different sources of variation. We use simulations because knowing the ground truth makes it possible to assess the capability of multiway data analysis methods.
Conclusion: Our numerical experiments demonstrate that despite the increasing complexity of the studied dynamic metabolic models, tensor factorization methods CANDECOMP/PARAFAC (CP) and Parallel Profiles with Linear Dependences (PARALIND) can disentangle the sources of variation and thereby reveal the underlying mechanisms and their dynamics.
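To make the idea of simulated dynamic data concrete, here is a small sketch of how a subjects by metabolites by time array could be generated from a simple linear system dx/dt = A x, with individual variation entering through subject-specific initial concentrations. Everything in it is an assumption for illustration only: the three-metabolite chain, the rate constants, and the helper names `chain_matrix` and `simulate_linear_system` are hypothetical and are not taken from the paper's simulation setup.

```python
import numpy as np


def chain_matrix(k1, k2):
    """Rate matrix for a hypothetical chain x1 -> x2 -> x3 with rates k1 and k2."""
    return np.array([
        [-k1, 0.0, 0.0],
        [k1, -k2, 0.0],
        [0.0, k2, 0.0],
    ])


def simulate_linear_system(a, x0, times):
    """Solve dx/dt = a @ x for every row of x0 (one row per subject).

    Uses the eigendecomposition of `a`, so `a` is assumed diagonalizable.
    Returns an array of shape (subjects, metabolites, time points).
    """
    eigvals, eigvecs = np.linalg.eig(a)
    coeffs = np.linalg.solve(eigvecs, x0.T)      # (states, subjects)
    decay = np.exp(np.outer(eigvals, times))     # (states, time points)
    # x[s, m, t] = sum_k eigvecs[m, k] * coeffs[k, s] * decay[k, t]
    return np.einsum("mk,ks,kt->smt", eigvecs, coeffs, decay).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = chain_matrix(k1=1.0, k2=0.3)
    # Individual variation: each subject starts from different concentrations.
    x0 = rng.uniform(0.5, 1.5, size=(20, 3))
    times = np.linspace(0.0, 10.0, 25)
    tensor = simulate_linear_system(a, x0, times)
    print(tensor.shape)
```

In this toy setting the `einsum` expression shows that the noise-free array is an exact sum of rank-one terms (one per eigenvalue of `a`), which gives some intuition for why a low-rank tensor factorization can be a natural tool for such data; more complex nonlinear models do not have this exact structure.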
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery, TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion |
Publication Type | Journal Article |
Year of Publication | 2022 |
Journal | BMC Bioinformatics |
Volume | 23 |
Number | Article 31 |
Date Published | 2022 |
Publisher | Springer |
DOI | 10.1186/s12859-021-04550-5 |
Book
Multiblock Data Fusion in Statistics and Machine Learning: Applications in the Natural and Life Sciences
Chichester, UK: John Wiley & Sons, 2022. Status: Published
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery |
Publication Type | Book |
Year of Publication | 2022 |
Number of Pages | 416 |
Publisher | John Wiley & Sons |
Place Published | Chichester, UK |
ISBN Number | 978-1-119-60097-8 |
Proceedings, refereed
Phenotyping of cervical cancer risk groups via generalized low-rank models using medical questionnaires
In Norwegian AI Symposium: Nordic Artificial Intelligence Research and Development. Springer, 2022. Status: Published
Affiliation | Machine Learning |
Project(s) | DeCipher, Department of Data Science and Knowledge Discovery |
Publication Type | Proceedings, refereed |
Year of Publication | 2022 |
Conference Name | Norwegian AI Symposium: Nordic Artificial Intelligence Research and Development |
Pagination | 94–110 |
Publisher | Springer |
DOI | 10.1007/978-3-031-17030-0_8 |
Talks, contributed
Exploring dynamic metabolomics data with multiway data analysis: A simulation study
In Virtual conference. SIAM Conference on Applications of Dynamical Systems, 2021. Status: Published
Analysis of dynamic metabolomics data sets holds the promise to improve our understanding of the underlying mechanisms in human metabolism. This is crucial for detecting changes in metabolism that can potentially lead to diseases. Dynamic metabolomics data has more than two axes of variation, i.e., samples, metabolites and time. While such time-evolving multi-way data sets have been increasingly collected in recent years, revealing the underlying mechanisms and their dynamics from such data remains challenging.
This talk will focus on a systematic study demonstrating the advantages and limitations of multi-way data analysis (also known as tensor factorizations) in analyzing dynamic metabolomics data. We study different dynamic models of increasing complexity, i.e., a simple linear system, a yeast glycolysis model, and a human cholesterol model, and generate data with different types of variation. Our numerical experiments demonstrate that despite the increasing complexity of the studied models, tensor factorization methods CANDECOMP/PARAFAC (CP) and PARAllel Profiles with LINear Dependences (PARALIND) can reveal the underlying mechanisms and their dynamics.
Affiliation | Machine Learning |
Project(s) | TrACEr: Time-Aware ConstrainEd Multimodal Data Fusion, Department of Data Science and Knowledge Discovery |
Publication Type | Talks, contributed |
Year of Publication | 2021 |
Location of Talk | Virtual conference |
Publisher | SIAM Conference on Applications of Dynamical Systems |
Generalized Low-Rank Models for Phenotyping Cervical Cancer Risk Groups using Medical Questionnaires
In Stavanger, Norway, 2021. Status: Published
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery, DeCipher |
Publication Type | Talks, contributed |
Year of Publication | 2021 |
Location of Talk | Stavanger, Norway |
Journal Article
Common and distinct components in data fusion
Journal of Chemometrics 31, no. 7 (2017): e2900. Status: Published
Affiliation | Machine Learning |
Project(s) | Department of Data Science and Knowledge Discovery |
Publication Type | Journal Article |
Year of Publication | 2017 |
Journal | Journal of Chemometrics |
Volume | 31 |
Issue | 7 |
Pagination | e2900 |
Date Published | Jan-07-2017 |
Publisher | Wiley |
URL | http://doi.wiley.com/10.1002/cem.v31.7, http://doi.wiley.com/10.1002/cem.2... |
DOI | 10.1002/cem.2900 |