Numerical influence of ReLU’(0) on backpropagation

Authors: David Bertoin, Jérôme Bolte, Sébastien Gerchinovitz, Edouard Pauwels

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate the importance of the value of ReLU'(0) for several precision levels (16, 32, 64 bits), on various networks (fully connected, VGG, ResNet) and datasets (MNIST, CIFAR10, SVHN, ImageNet). We observe considerable variations of backpropagation outputs which occur around half of the time in 32 bits precision. (A hedged PyTorch sketch of varying ReLU'(0) follows the table.)
Researcher Affiliation | Collaboration | David Bertoin, IRT Saint Exupéry, ISAE-SUPAERO, ANITI, Toulouse, France, david.bertoin@irt-saintexupery.com; Jérôme Bolte, Toulouse School of Economics, Université Toulouse 1 Capitole, ANITI, Toulouse, France, jbolte@ut-capitole.fr; Sébastien Gerchinovitz, IRT Saint Exupéry, Institut de Mathématiques de Toulouse, ANITI, Toulouse, France, sebastien.gerchinovitz@irt-saintexupery.com; Edouard Pauwels, CNRS, IRIT, Université Paul Sabatier, ANITI, Toulouse, France, edouard.pauwels@irit.fr
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks labeled 'Algorithm' or 'Pseudocode'.
Open Source Code | Yes | All our experiments are done using PyTorch [26]; we provide the code to generate all figures presented in this manuscript.
Open Datasets | Yes | MNIST dataset [24], CIFAR10 dataset [23], SVHN [25], and ImageNet [12]
Dataset Splits | No | The paper references well-known datasets (e.g., MNIST, CIFAR10, ImageNet) but does not explicitly provide the percentages or sample counts for the train, validation, and test splits needed to reproduce the data partitioning. It mentions 'test accuracy' and 'training loss' but gives no detailed split information for validation.
Hardware Specification | No | The paper mentions that some experiments were run 'on a CPU' and others 'on GPU' but does not provide specifics such as CPU or GPU models (e.g., NVIDIA A100), memory, or number of cores.
Software Dependencies | No | The paper mentions using PyTorch [26], TensorFlow [2], Jax [10], and the optuna library [3], but it does not provide version numbers for any of these software components.
Experiment Setup | Yes | We initialized two fully connected neural networks f_0 and f_1 of size 784-2000-128-10 with the same weights... with the same sequence of mini-batches (B_k)_{k∈ℕ} (mini-batch size 128), using the recursion in (4) for s = 0 and s = 1 and with a fixed α_k = 1, and γ chosen uniformly at random in [0.01, 0.012]. (A hedged sketch of this twin-network setup follows the table.)
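
The core manipulation quoted in the Research Type row, choosing the value assigned to ReLU'(0) during backpropagation, can be reproduced in PyTorch with a custom autograd function. The sketch below is not the authors' released code: the class name ReLUAlpha, the helper backprop_output, and the toy comparison are illustrative assumptions showing how the mechanism could be implemented.

```python
import torch

class ReLUAlpha(torch.autograd.Function):
    """ReLU whose backward pass uses a chosen constant s as the value of ReLU'(0).

    Hypothetical helper for illustration only, not the authors' released code.
    """

    @staticmethod
    def forward(ctx, x, s):
        ctx.save_for_backward(x)
        ctx.s = s
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        grad = torch.zeros_like(x)
        grad[x > 0] = 1.0
        grad[x == 0] = ctx.s             # the only place where s = ReLU'(0) enters
        return grad_output * grad, None  # no gradient with respect to s


def backprop_output(s, x, w, dtype):
    # Gradient of sum(ReLU_s(x @ w)) with respect to the input x, at the given precision.
    x = x.to(dtype).requires_grad_(True)
    w = w.to(dtype)
    ReLUAlpha.apply(x @ w, s).sum().backward()
    return x.grad


# Exact zeros in the pre-activations are what make s matter. A single random layer
# rarely hits them, so one row is forced to zero here to expose the bifurcation;
# the paper shows such exact zeros arise spontaneously during 16- and 32-bit training.
torch.manual_seed(0)
x = torch.randn(128, 784)
x[0] = 0.0                              # force an exactly-zero pre-activation row
w = torch.randn(784, 10)
for dtype in (torch.float32, torch.float64):   # the paper also uses float16 (on GPU)
    g0 = backprop_output(0.0, x, w, dtype)
    g1 = backprop_output(1.0, x, w, dtype)
    print(dtype, "max |grad(s=1) - grad(s=0)|:", (g1 - g0).abs().max().item())
```

The Experiment Setup row quotes a twin-network protocol: two identical 784-2000-128-10 fully connected networks fed the same mini-batch sequence, differing only in the value used for ReLU'(0). The sketch below is a hedged reconstruction, assuming the recursion in (4) amounts to a plain gradient step with step size γ·α_k; make_mlp, the cross-entropy loss, and the synthetic mini-batch are illustrative assumptions, and the ReLU'(0) difference itself would be injected via a custom function such as ReLUAlpha above.

```python
import copy
import random
import torch
import torch.nn as nn

def make_mlp():
    # 784-2000-128-10 fully connected architecture from the quoted setup
    return nn.Sequential(
        nn.Linear(784, 2000), nn.ReLU(),
        nn.Linear(2000, 128), nn.ReLU(),
        nn.Linear(128, 10),
    )

torch.manual_seed(0)
f0 = make_mlp()
f1 = copy.deepcopy(f0)                  # f1 starts from exactly the same weights as f0
gamma = random.uniform(0.01, 0.012)     # gamma drawn uniformly at random in [0.01, 0.012]
alpha_k = 1.0                           # fixed alpha_k = 1
criterion = nn.CrossEntropyLoss()       # assumed loss; not specified in the quoted setup

# One shared mini-batch (size 128); in the experiment the same sequence (B_k) feeds both
# networks, and f0 / f1 backpropagate with ReLU'(0) = 0 and ReLU'(0) = 1 respectively.
xb = torch.randn(128, 784)
yb = torch.randint(0, 10, (128,))
for net in (f0, f1):
    loss = criterion(net(xb), yb)
    net.zero_grad()
    loss.backward()
    with torch.no_grad():               # gradient step, assumed form of recursion (4)
        for p in net.parameters():
            p -= gamma * alpha_k * p.grad
```

With identical initial weights, mini-batch order, and step sizes, any divergence between f_0 and f_1 over training can only come from the choice of ReLU'(0) interacting with exact zeros, which is the effect the paper measures.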