Please use this identifier to cite or link to this item: http://doi.org/10.25358/openscience-5712
Full metadata record
DC Field | Value | Language
dc.contributor.author | Andreani, Tommaso | -
dc.date.accessioned | 2021-03-29T13:27:54Z | -
dc.date.available | 2021-03-29T13:27:54Z | -
dc.date.issued | 2021 | -
dc.identifier.uri | https://openscience.ub.uni-mainz.de/handle/20.500.12030/5721 | -
dc.description.abstract | A central question in biology today is which genetic and epigenetic factors are involved in the regulation of gene expression, and in which cases their deregulation can contribute to the development of abnormal phenotypes or diseases. Innovations in genome sequencing techniques and the corresponding data processing algorithms have enabled unbiased interrogation of the different genomic and epigenomic components of transcription at nucleotide resolution. It is therefore now possible to use and integrate different types of data for both bulk and single-cell samples, and to understand the molecular components of gene expression regulation using ad hoc, reproducible computational analysis. As an interdisciplinary field, bioinformatics takes advantage of different quantitative disciplines, such as statistics and machine learning. This allows the implementation of detailed analyses to support and elucidate specific fundamental discoveries, and also to test unexpected predictions coming from exploratory data analysis. In particular, the use of bioinformatics is a necessity in the study of the genomic basis of gene regulation, given the complexity of the data produced. Thus, the application of existing bioinformatics methods and the development of novel ones improve the interpretation of new data by integrating several data types from multiple sources.
In this thesis, I applied and developed bioinformatics methods to help investigate basic biological questions in the genomic study of epigenetic gene regulation: i) I created a pipeline for whole-genome bisulfite sequencing data analysis to improve the understanding of the way genes and DNA sequences are demethylated by GADD45 proteins and how this might be linked to a key stage of development in mouse embryonic stem cells (mESCs); ii) I developed a metric based on the Gini index to evaluate unsupervised clustering results obtained using several computational methods that were tested to identify various types of peripheral blood mononuclear cells (PBMCs) from single-cell ATAC-seq samples in which the labels of the cells were not provided; and iii) I developed an algorithm to extract variable regions in ChIP-seq data that can improve the identification of target-specific binding sites of different proteins in several cell lines of the ENCODE project. Together, these three studies make a significant contribution to improving the interpretation of genomic data for the study of epigenetic gene regulation by bioinformatics. | en_GB
dc.language.iso | eng | de
dc.rights | in Copyright | *
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | *
dc.subject.ddc | 500 Naturwissenschaften | de_DE
dc.subject.ddc | 500 Natural sciences and mathematics | en_GB
dc.subject.ddc | 570 Biowissenschaften | de_DE
dc.subject.ddc | 570 Life sciences | en_GB
dc.title | From DNA sequences to cell types by detecting regulatory genomic regions in sequencing data | en_GB
dc.type | Dissertation | de
dc.identifier.urn | urn:nbn:de:hebis:77-openscience-b7189d64-17a4-4d1d-9e6a-dc1175630a1e3 | -
dc.identifier.doi | http://doi.org/10.25358/openscience-5712 | -
jgu.type.dinitype | doctoralThesis | en_GB
jgu.type.version | Original work | de
jgu.type.resource | Text | de
jgu.date.accepted | 2020-06-19 | -
jgu.description.extent | 167 Seiten, Illustrationen | de
jgu.organisation.department | FB 10 Biologie | de
jgu.organisation.department | Externe Einrichtungen | de
jgu.organisation.year | 2019 | -
jgu.organisation.number | 7970 | -
jgu.organisation.number | 0000 | -
jgu.organisation.name | Johannes Gutenberg-Universität Mainz | -
jgu.rights.accessrights | openAccess | -
jgu.organisation.place | Mainz | -
jgu.subject.ddccode | 500 | de
jgu.subject.ddccode | 570 | de
Appears in collections: JGU-Publikationen

Files in This Item:
File | Description | Size | Format
andreani_tommaso-from_dna_seque-20210323164932284.pdf | | 3.22 MB | Adobe PDF