Symmetries in Overparametrized Neural Networks: A Mean Field View

Authors: Javier Maass, Joaquín Fontbona

NeurIPS 2024

Reproducibility assessment. Each entry gives the variable assessed, the result, and the LLM's supporting response (quoting the paper where applicable).

Research Type: Experimental
"We illustrate the validity of our findings as N gets larger, in a teacher-student experimental setting, training a student NN to learn from a WI, SI or arbitrary teacher model through various SL schemes."

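To make the described setting concrete, below is a minimal sketch of a teacher-student pair: a frozen shallow teacher generating labels and a mean-field-scaled shallow student of width N. All names, dimensions, and the 1/N output scaling are illustrative assumptions, not specifics taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 2      # input dimension (illustrative)
N = 1000   # student width; the mean-field analysis concerns the N -> infinity limit

# Frozen random teacher: a stand-in for the WI, SI or arbitrary teacher f.
W_t = rng.standard_normal((8, d))
a_t = rng.standard_normal(8)

def teacher(x):
    return np.tanh(x @ W_t.T) @ a_t

# Overparametrized student with mean-field scaling: (1/N) * sum_i a_i * tanh(w_i . x).
W_s = rng.standard_normal((N, d))
a_s = rng.standard_normal(N)

def student(x):
    return np.tanh(x @ W_s.T) @ a_s / N
```
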
Researcher Affiliation: Academia
Javier Maass Martínez, Center for Mathematical Modeling, University of Chile, javier.maass@gmail.com; Joaquín Fontbona, Center for Mathematical Modeling, University of Chile, fontbona@dim.uchile.cl

Pseudocode: No
The paper describes the SGD training dynamics with equations (1) and (5), but it does not include a dedicated pseudocode or algorithm block.

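Since the update rule is only given as equations (1) and (5) in the paper, the following pseudocode-level step is a hedged reconstruction: it assumes τ enters as an L2 penalty and β as the intensity of additive Gaussian noise, which is our reading rather than the paper's verbatim dynamics.

```python
import numpy as np

def sgd_step(params, grad_loss, s, tau, beta, rng):
    """One noisy, L2-regularized minibatch-SGD step (hypothetical reading,
    not a verbatim transcription of equations (1) and (5))."""
    noise = np.sqrt(2.0 * s * beta) * rng.standard_normal(params.shape)
    return params - s * (grad_loss + tau * params) + noise
```
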
Open Source Code: Yes
"Code necessary for replicating the obtained results, as well as a detailed description of our experimental setting, can be sought in the Supp. Mat."

Open Datasets: No
"We consider synthetic data produced in a teacher-student setting... Our data distribution π will be such that (X, Y) ∼ π will satisfy X ∼ N(0, σ_π^2 · Id_2) (with σ_π = 4), and Y = f(X)."

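A minimal data-generation sketch matching the quoted description; σ_π = 4 is taken from the paper, while the dimension, sample count, and the rotation-invariant placeholder teacher are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_pi, d, n = 4.0, 2, 10_000   # sigma_pi = 4 as quoted; d and n are illustrative

def f(x):
    # Placeholder rotation-invariant teacher; the paper's f is a WI, SI or arbitrary model.
    return np.linalg.norm(x, axis=1)

X = sigma_pi * rng.standard_normal((n, d))   # X ~ N(0, sigma_pi^2 * Id)
Y = f(X)                                     # Y = f(X)
```
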
Dataset Splits: No
The paper describes the training process in a teacher-student setting, specifying aspects such as epochs and minibatch SGD, but it does not define explicit train/validation/test splits with percentages or sample counts, since it uses synthetic data.

Hardware Specification: Yes
"All the different experiments were run on Python 3.10, on a Google Colab session consisting (by default) of 2 Intel Xeon virtual CPUs (2.20GHz) and with 13GB of RAM."

Software Dependencies: No
The paper mentions 'Python 3.10', 'objax default SGD training', 'pytorch and jax', and the 'emlp repository'. While a version is given for Python (3.10), no versions are provided for objax, PyTorch, JAX, or emlp.

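Because no library versions are pinned, a replication would first need to record the environment; a small snippet that logs whichever versions are installed, assuming the standard PyPI distribution names:

```python
from importlib.metadata import PackageNotFoundError, version

for pkg in ("objax", "jax", "torch", "emlp"):   # libraries named in the paper
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```
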
Experiment Setup: Yes
"The training parameters were fixed to be (unless explicitly stated otherwise): Step Size: ς ≡ α > 0 (with α = 50 in most experiments), ε_N = 1/N, so that s_k^N = α/N. Regularization parameters: τ = 10^-4 and β = 10^-6. Batch Size: It was chosen to be B = 20. Number of Training Epochs: ... N_e = N·T epochs (iterations)."

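Collecting the quoted values in one place, a hedged configuration sketch: N and the time horizon T are free parameters of the experiments (the values below are placeholders), while the remaining quantities follow the quoted setup.

```python
N = 1000                   # number of neurons (placeholder; the paper varies N)
T = 5.0                    # mean-field time horizon (placeholder)
alpha = 50.0               # step-size scale; alpha = 50 in most experiments
eps_N = 1.0 / N            # eps_N = 1/N
step_size = alpha * eps_N  # s_k^N = alpha / N
tau, beta = 1e-4, 1e-6     # regularization parameters
batch_size = 20            # B = 20
n_epochs = int(N * T)      # N_e = N*T epochs (iterations)
```

With this scaling, N iterations of step size α/N advance one unit of mean-field time, so the iteration count N·T grows linearly with N for a fixed horizon T.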