Studying the evolution of neural activation patterns during training of feed-forward ReLU networks

dc.contributor.author: Hartmann, David
dc.contributor.author: Franzen, Daniel
dc.contributor.author: Brodehl, Sebastian
dc.date.accessioned: 2022-03-24T11:00:09Z
dc.date.available: 2022-03-24T11:00:09Z
dc.date.issued: 2021
dc.description.abstract: The ability of deep neural networks to form powerful emergent representations of complex statistical patterns in data is as remarkable as it is imperfectly understood. For deep ReLU networks, these representations are encoded in the mixed discrete–continuous structure of linear weight matrices and non-linear binary activations. Our article develops a new technique for instrumenting such networks to efficiently record activation statistics, such as information content (entropy) and similarity of patterns, in real-world training runs. We then study the evolution of activation patterns during training for networks of different architectures using different training and initialization strategies. As a result, we observe characteristic behavioral patterns, some general and some architecture-related: in particular, most architectures form bottom-up structure, with the exception of highly tuned state-of-the-art architectures and methods (PyramidNet and FixUp), where layers appear to converge more simultaneously. We also observe intermediate dips in entropy in conventional CNNs that are not visible in residual networks. A reference implementation is provided under a free license. [en_GB]
dc.description.sponsorship: Open Access-Publizieren Universität Mainz / Universitätsmedizin Mainz [de]
dc.identifier.doi: http://doi.org/10.25358/openscience-6840
dc.identifier.uri: https://openscience.ub.uni-mainz.de/handle/20.500.12030/6851
dc.language.iso: eng [de]
dc.rights: CC-BY-4.0 [*]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [*]
dc.subject.ddc: 004 Informatik [de_DE]
dc.subject.ddc: 004 Data processing [en_GB]
dc.title: Studying the evolution of neural activation patterns during training of feed-forward ReLU networks [en_GB]
dc.type: Zeitschriftenaufsatz (journal article) [de]
jgu.journal.title: Frontiers in artificial intelligence [de]
jgu.journal.volume: 4 [de]
jgu.organisation.department: FB 08 Physik, Mathematik u. Informatik [de]
jgu.organisation.name: Johannes Gutenberg-Universität Mainz
jgu.organisation.number: 7940
jgu.organisation.place: Mainz
jgu.organisation.ror: https://ror.org/023b0x485
jgu.pages.alternative: 642374 [de]
jgu.publisher.doi: 10.3389/frai.2021.642374 [de]
jgu.publisher.issn: 2624-8212 [de]
jgu.publisher.name: Frontiers Media [de]
jgu.publisher.place: Lausanne [de]
jgu.publisher.year: 2021
jgu.rights.accessrights: openAccess
jgu.subject.ddccode: 004 [de]
jgu.type.dinitype: Article [en_GB]
jgu.type.resource: Text [de]
jgu.type.version: Published version [de]

Files

Original bundle

Name: studying_the_evolution_of_neu-20220322111055686.pdf
Size: 2.69 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 3.57 KB
Format: Item-specific license agreed upon to submission