Please use this identifier to cite or link to this item: http://doi.org/10.25358/openscience-7432
Full metadata record
DC Field | Value | Language
dc.contributor.author | Gütlein, Martin | -
dc.contributor.author | Kramer, Stefan | -
dc.date.accessioned | 2022-07-15T09:37:45Z | -
dc.date.available | 2022-07-15T09:37:45Z | -
dc.date.issued | 2016 |
dc.identifier.uri | https://openscience.ub.uni-mainz.de/handle/20.500.12030/7446 | -
dc.description.abstract | Background: Even though circular fingerprints were first introduced more than 50 years ago, they are still widely used for building highly predictive, state-of-the-art (Q)SAR models. Historically, these structural fragments were designed to search large molecular databases. Hence, to derive a compact representation, circular fingerprint fragments are often folded to comparatively short bit-strings. However, folding fingerprints introduces bit collisions, which adds noise to the encoded structural information and removes its interpretability. Both representations, folded as well as unprocessed fingerprints, are often used for (Q)SAR modeling. Results: We show that it can be preferable to build (Q)SAR models with circular fingerprint fragments that have been filtered by supervised feature selection, instead of using folded fingerprints or all fragments. Compared to folded fingerprints, filtered fingerprints significantly increase predictive performance and remain unambiguous and interpretable. Compared to unprocessed fingerprints, filtered fingerprints reduce the computational effort and are a more compact and less redundant feature representation. Depending on the selected learning algorithm, filtering yields about equally predictive (Q)SAR models. We demonstrate the suitability of filtered fingerprints for (Q)SAR modeling by presenting our freely available web service Collision-free Filtered Circular Fingerprints, which provides rationales for predictions by highlighting important structural features in the query compound (see http://coffer.informatik.uni-mainz.de). Conclusions: Circular fingerprints are potent structural features that yield highly predictive models and encode interpretable structural information. However, to retain interpretability, circular fingerprints should not be folded when building prediction models. Our experiments show that filtering is a suitable option to reduce the high computational effort when working with all fingerprint fragments. Additionally, our experiments suggest that the area under the precision-recall curve is a more sensible statistic for validating (Q)SAR models for virtual screening than the area under the ROC curve or other measures for early recognition. | en_GB
dc.description.sponsorship | DFG, Open Access-Publizieren Universität Mainz / Universitätsmedizin | de
dc.language.iso | eng | de
dc.rights | CC BY | *
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | *
dc.subject.ddc | 004 Informatik | de_DE
dc.subject.ddc | 004 Data processing | en_GB
dc.title | Filtered circular fingerprints improve either prediction or runtime performance while retaining interpretability | en_GB
dc.type | Zeitschriftenaufsatz | de
dc.identifier.doi | http://doi.org/10.25358/openscience-7432 | -
jgu.type.dinitype | article | en_GB
jgu.type.version | Published version | de
jgu.type.resource | Text | de
jgu.organisation.department | FB 08 Physik, Mathematik u. Informatik | de
jgu.organisation.number | 7940 | -
jgu.organisation.name | Johannes Gutenberg-Universität Mainz | -
jgu.rights.accessrights | openAccess | -
jgu.journal.title | Journal of cheminformatics | de
jgu.journal.volume | 8 | de
jgu.pages.alternative | Art. 60 | de
jgu.publisher.year | 2016 | -
jgu.publisher.name | BioMed Central | de
jgu.publisher.place | London | de
jgu.publisher.uri | http://dx.doi.org/10.1186/s13321-016-0173-z | de
jgu.publisher.issn | 1758-2946 | de
jgu.organisation.place | Mainz | -
jgu.subject.ddccode | 004 | de
opus.date.modified | 2018-08-22T10:09:58Z |
opus.subject.dfgcode | 13-409 |
opus.organisation.string | FB 08: Physik, Mathematik und Informatik: Institut für Informatik | de_DE
opus.identifier.opusid | 55064 |
opus.institute.number | 0805 |
opus.metadataonly | false |
opus.type.contenttype | Keine | de_DE
opus.type.contenttype | None | en_EN
opus.affiliated | Gütlein, Martin |
opus.affiliated | Kramer, Stefan |
jgu.publisher.doi | 10.1186/s13321-016-0173-z | de
jgu.organisation.ror | https://ror.org/023b0x485 | -
Appears in collections: DFG-OA-Publizieren (2012 - 2017)

Files in This Item:
File | Description | Size | Format
filtered_circular_fingerprint-20220714105635233.pdf | | 3.63 MB | Adobe PDF