On sampling error in genetic programming

dc.contributor.author: Schweim, Dirk
dc.contributor.author: Wittenberg, David
dc.contributor.author: Rothlauf, Franz
dc.date.accessioned: 2022-05-03T09:45:51Z
dc.date.available: 2022-05-03T09:45:51Z
dc.date.issued: 2021
dc.description.abstract: The initial population in genetic programming (GP) should form a representative sample of all possible solutions (the search space). While large populations accurately approximate the distribution of possible solutions, small populations tend to incorporate a sampling error. This paper analyzes how the size of a GP population affects the sampling error and contributes to answering the question of how to size initial GP populations. First, we present a probabilistic model of the expected number of subtrees for GP populations initialized with full, grow, or ramped half-and-half. Second, based on our frequency model, we present a model that estimates the sampling error for a given GP population size. We validate our models empirically and show that, compared to smaller population sizes, our recommended population sizes largely reduce the sampling error of measured fitness values. Increasing the population sizes even more, however, does not considerably reduce the sampling error of fitness values. Last, we recommend population sizes for some widely used benchmark problem instances that result in a low sampling error. A low sampling error at initialization is necessary (but not sufficient) for a reliable search since lowering the sampling error means that the overall random variations in a random sample are reduced. Our results indicate that sampling error is a severe problem for GP, making large initial population sizes necessary to obtain a low sampling error. Our model allows practitioners of GP to determine a minimum initial population size so that the sampling error is lower than a threshold, given a confidence level. [en_GB]
dc.identifier.doi: http://doi.org/10.25358/openscience-5820
dc.identifier.uri: https://openscience.ub.uni-mainz.de/handle/20.500.12030/5829
dc.language.iso: eng [de]
dc.rights: CC-BY-4.0 [*]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [*]
dc.subject.ddc: 330 Wirtschaft [de_DE]
dc.subject.ddc: 330 Economics [en_GB]
dc.title: On sampling error in genetic programming [en_GB]
dc.type: Zeitschriftenaufsatz (journal article) [de]
jgu.journal.title: Natural computing [de]
jgu.journal.volume: 2021 [de]
jgu.organisation.department: FB 03 Rechts- und Wirtschaftswissenschaften [de]
jgu.organisation.name: Johannes Gutenberg-Universität Mainz
jgu.organisation.number: 2300
jgu.organisation.place: Mainz
jgu.organisation.ror: https://ror.org/023b0x485
jgu.publisher.doi: 10.1007/s11047-020-09828-w [de]
jgu.publisher.issn: 1572-9796 [de]
jgu.publisher.name: Springer Science + Business Media B.V. [de]
jgu.publisher.place: Dordrecht [de]
jgu.publisher.year: 2021
jgu.rights.accessrights: openAccess
jgu.subject.ddccode: 330 [de]
jgu.type.dinitype: Article [en_GB]
jgu.type.resource: Text [de]
jgu.type.version: Published version [de]
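
Note on the abstract: its final sentence describes choosing a minimum initial population size so that the sampling error stays below a threshold at a given confidence level. The paper's own model is not reproduced in this record. Purely as an illustration of that kind of calculation, the sketch below uses the generic textbook normal-approximation bound for estimating a single proportion. This is an assumption, not the authors' subtree-frequency model, and the function name `min_sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def min_sample_size(max_error: float, confidence: float, p: float = 0.5) -> int:
    """Smallest n such that a proportion p is estimated within +/- max_error
    at the given two-sided confidence level (normal approximation).

    ASSUMPTION: generic textbook bound n >= z^2 * p * (1 - p) / max_error^2.
    It is NOT the model from the paper; p = 0.5 is the worst case.
    """
    if not 0.0 < max_error < 1.0:
        raise ValueError("max_error must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil(z * z * p * (1.0 - p) / (max_error * max_error))


if __name__ == "__main__":
    # Halving the tolerated error roughly quadruples the required sample size.
    for err in (0.10, 0.05, 0.01):
        print(err, min_sample_size(err, 0.95))
```

Under this simplified bound the required size grows with the inverse square of the tolerated error, which is in line with the abstract's remark that large initial populations are needed for a low sampling error.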

Files

Original bundle

Name: on_sampling_error_in_genetic_-20220503114624306.pdf
Size: 900.06 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.57 KB
Format: Item-specific license agreed upon to submission