Unsupervised identification of metastable molecular conformations with deep learning methods
| dc.contributor.advisor | Speck, Thomas | |
| dc.contributor.author | Lemcke, Simon | |
| dc.date.accessioned | 2025-10-20T08:32:32Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | The rise of compute power over the last decades, best described by Moore’s empirical law, has made it possible to establish simulation as the third pillar of science in between the longstanding pillars of theory and experiment. Investigating systems 'in silico' has since become a widespread approach to research, enabling numerical insights on scales not accessible to theory and experiment. In recent years, artificial intelligence and machine learning, specifically deep learning, have emerged as key technologies of the information age, fueled by the abundant availability of computation and data. In this thesis, we show in two case studies that a deep learning approach to dimensionality reduction, called EncoderMap, is able to find better, more descriptive collective variables in the same number of dimensions than established linear methods. In the main chapter, we concern ourselves with improving the analysis of simulation data by incorporating this deep learning method. Simulation can be considered an experiment conducted on a computer that creates a lot of raw data from which insights can only be extracted in a nontrivial manner. This analysis follows an elaborate modeling pipeline, which consists of multiple steps and algorithms. One of these crucial steps is dimensionality reduction, in which high-dimensional data is mapped into a lower-dimensional space, retaining as much of the important information as possible and aiming to find descriptive collective variables fit for modeling. We show with a well-studied small peptide, deca-alanine, that the aforementioned deep autoencoder architecture with an additional distance metric (EncoderMap) finds collective variables that are at least as good as those of an established linear method (TICA) in the same number of dimensions. Connecting results obtained by simulation back to experiment is done by identifying metastable states, long-lived structural conformations that are accessible to experiment. We compare these dimensionality reduction methods in their ability to find expressive collective variables that allow these metastable states to be identified. Lastly, as EncoderMap does not make use of the time-series character of the data and works on structure alone, our results hint towards potential applications in combination with algorithms that allow unordered data to be harvested quickly, e.g. Monte Carlo simulations. | en |
| dc.identifier.doi | https://doi.org/10.25358/openscience-13426 | |
| dc.identifier.uri | https://openscience.ub.uni-mainz.de/handle/20.500.12030/13447 | |
| dc.identifier.urn | urn:nbn:de:hebis:77-b0c7375c-93d0-4e36-84be-d58c17196f593 | |
| dc.language.iso | eng | |
| dc.rights | CC-BY-4.0 | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
| dc.subject.ddc | 530 Physik | de |
| dc.subject.ddc | 530 Physics | en |
| dc.title | Unsupervised identification of metastable molecular conformations with deep learning methods | en |
| dc.type | Dissertation | |
| jgu.date.accepted | 2025-09-26 | |
| jgu.description.extent | ix, 142 pages; illustrations, diagrams | |
| jgu.identifier.uuid | b0c7375c-93d0-4e36-84be-d58c17196f59 | |
| jgu.organisation.department | FB 08 Physik, Mathematik u. Informatik | |
| jgu.organisation.name | Johannes Gutenberg-Universität Mainz | |
| jgu.organisation.number | 7940 | |
| jgu.organisation.place | Mainz | |
| jgu.organisation.ror | https://ror.org/023b0x485 | |
| jgu.rights.accessrights | openAccess | |
| jgu.subject.ddccode | 530 | |
| jgu.type.dinitype | PhDThesis | en_GB |
| jgu.type.resource | Text | |
| jgu.type.version | Original work | |