Please use this identifier to cite or link to this item: http://doi.org/10.25358/openscience-9174
Authors: Hauptmann, Tony
Kramer, Stefan
Title: A fair experimental comparison of neural network architectures for latent representations of multi-omics for drug response prediction
Online publication date: 14-Jun-2023
Year of first publication: 2023
Language: English
Abstract:
Background: Recent years have seen a surge of novel neural network architectures for the integration of multi-omics data for prediction. Most of these architectures include either encoders alone or encoders and decoders, i.e., autoencoders of various sorts, to transform multi-omics data into latent representations. One important parameter is the depth of integration: the point at which the latent representations are computed or merged, which can be early, intermediate, or late. The literature on integration methods is growing steadily; however, close to nothing is known about the relative performance of these methods under fair experimental conditions and under consideration of different use cases.
Results: We developed a comparison framework that trains and optimizes multi-omics integration methods under equal conditions. We incorporated early integration, PCA, and four recently published deep learning methods: MOLI, Super.FELT, OmiEmbed, and MOMA. Furthermore, we devised a novel method, Omics Stacking, that combines the advantages of intermediate and late integration. Experiments were conducted on a public drug response data set with multiple omics data types (somatic point mutations, somatic copy number profiles, and gene expression profiles) obtained from cell lines, patient-derived xenografts, and patient samples. Our experiments confirmed that early integration has the lowest predictive performance. Overall, architectures that integrate a triplet loss achieved the best results. Statistical differences were rarely observed; however, in terms of average ranks, Super.FELT consistently performed best in the cross-validation setting and Omics Stacking performed best in the external test set setting.
Conclusions: We recommend that researchers follow fair comparison protocols, as suggested in the paper. When faced with a new data set, Super.FELT is a good option in the cross-validation setting, as is Omics Stacking in the external test set setting. Statistical significance is hardly observable, despite clear trends in the algorithms' rankings. Future work on refined transfer learning methods tailored to this domain may improve performance on external test sets. The source code of all experiments is available at https://github.com/kramerlab/Multi-Omics_analysis
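To make the notion of "depth of integration" concrete, the following is a minimal illustrative sketch (not the authors' code, which is linked above): early integration concatenates the raw omics inputs before encoding, while intermediate integration encodes each omics type separately and merges the resulting latent representations. All class names, layer sizes, and input dimensions here are hypothetical, and PyTorch is assumed as the framework.

# Illustrative sketch only; see https://github.com/kramerlab/Multi-Omics_analysis
# for the actual implementations compared in the paper.
import torch
import torch.nn as nn

class EarlyIntegration(nn.Module):
    """Concatenate all omics inputs first, then learn one latent representation."""
    def __init__(self, dims, latent=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(sum(dims), latent), nn.ReLU())
        self.head = nn.Linear(latent, 1)  # drug response score

    def forward(self, omics):                      # omics: list of tensors
        z = self.encoder(torch.cat(omics, dim=1))  # merge before encoding
        return self.head(z)

class IntermediateIntegration(nn.Module):
    """Encode each omics type separately, then merge the latent representations."""
    def __init__(self, dims, latent=64):
        super().__init__()
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, latent), nn.ReLU()) for d in dims
        )
        self.head = nn.Linear(latent * len(dims), 1)

    def forward(self, omics):
        zs = [enc(x) for enc, x in zip(self.encoders, omics)]
        return self.head(torch.cat(zs, dim=1))     # merge after encoding

# Hypothetical feature counts for mutation, copy number, and expression profiles
dims = [300, 400, 500]
batch = [torch.randn(8, d) for d in dims]
print(EarlyIntegration(dims)(batch).shape)         # torch.Size([8, 1])
print(IntermediateIntegration(dims)(batch).shape)  # torch.Size([8, 1])

Late integration would instead train a separate predictor per omics type and combine their outputs. As the abstract notes, the best-performing architectures additionally shape the latent space with a triplet loss; in PyTorch such a loss is available as torch.nn.TripletMarginLoss.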
DDC: 004 Computer science
004 Data processing
Institution: Johannes Gutenberg-Universität Mainz
Department: FB 08 Physik, Mathematik u. Informatik
Place: Mainz
ROR: https://ror.org/023b0x485
DOI: http://doi.org/10.25358/openscience-9174
Version: Published version
Publication type: Journal article
Document type specification: Scientific article
License: CC BY
Information on rights of use: https://creativecommons.org/licenses/by/4.0/
Journal: BMC Bioinformatics
Volume: 24
Pages or article number: 45
Publisher: Springer
Publisher place: London
Issue date: 2023
ISSN: 1471-2105
Publisher DOI: 10.1186/s12859-023-05166-7
Appears in collections: DFG-491381577-G

Files in This Item:
  File: a_fair_experimental_compariso-20230614103216807.pdf
  Size: 1.64 MB
  Format: Adobe PDF