Neuron birth-death dynamics accelerates gradient descent and converges asymptotically

Authors: Grant Rotskoff, Samy Jelassi, Joan Bruna, Eric Vanden-Eijnden

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We implement this non-local dynamics as a stochastic neuronal birth-death process and we prove that it accelerates the rate of convergence in the mean-field limit. We subsequently realize this PDE with two classes of numerical schemes that converge to the mean-field equation, each of which can easily be implemented for neural networks with finite numbers of units. We illustrate our algorithms with two models to provide intuition for the mechanism through which convergence is accelerated." (A hedged sketch of such a birth-death step appears below the table.)
Researcher Affiliation | Academia | Courant Institute, New York University, New York, USA; Center for Data Science, New York University, New York, USA; Princeton University, Princeton, New Jersey, USA
Pseudocode | Yes | Algorithm 1, "Parameter birth-death dynamics consistent with (13)"
Open Source Code | Yes | "In Fig. 6, we show convergence to the energy minimizer for a mixture of three Gaussians (details and source code are provided in the SM)."
Open Datasets | No | The paper uses illustrative examples such as a mixture of Gaussians and a student-teacher ReLU network, but does not provide concrete access information (link, DOI, or formal citation) for a publicly available dataset; the data appear to be synthetic, constructed or simulated for the experiments.
Dataset Splits | No | The paper does not provide explicit training/validation/test splits (percentages, sample counts, or citations to predefined splits) that would allow the data partitioning to be reproduced.
Hardware Specification | No | The paper does not report the hardware used for the experiments, such as GPU/CPU models, memory, or the computing environment.
Software Dependencies | No | The paper mentions implementations in PyTorch but does not specify version numbers for PyTorch or any other software dependency needed to reproduce the ancillary software environment.
Experiment Setup | No | The paper states that training uses stochastic gradient descent (SGD) with mini-batch estimates, but it omits hyperparameters such as the learning rate, batch size, and number of epochs, as well as other optimizer settings needed to reproduce the experimental setup. (An illustrative training-loop sketch with placeholder values follows the table.)
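
To make the mechanism described in the abstract excerpt concrete, here is a minimal sketch of one birth-death resampling step for a two-layer ReLU network under a squared loss. The per-unit potential estimate, the kill/duplicate rule, and all variable names are assumptions made for illustration; this is not a transcription of the paper's Algorithm 1.

```python
import numpy as np

def birth_death_step(thetas, cs, X, Y, dt, rng):
    """Hedged sketch of a birth-death resampling step for a two-layer network
    f(x) = (1/n) * sum_i cs[i] * relu(thetas[i] @ x).

    thetas : (n, d) hidden-layer weights
    cs     : (n,)   output weights
    X, Y   : mini-batch inputs (batch, d) and targets (batch,)
    dt     : time step controlling the jump probabilities
    rng    : numpy random Generator
    """
    n = thetas.shape[0]
    acts = np.maximum(X @ thetas.T, 0.0)            # (batch, n) ReLU activations
    resid = acts @ cs / n - Y                        # residuals of the squared loss
    # Per-unit potential: sensitivity of the loss to each unit's "mass".
    V = (resid[:, None] * acts).mean(axis=0) * cs    # (n,)
    V -= V.mean()                                    # center so the unit count is conserved on average
    for i in range(n):
        if rng.random() < min(1.0, abs(V[i]) * dt):  # jump with rate ~ |V_i - mean(V)|
            j = rng.integers(n)
            if V[i] > 0:
                # "death": a high-potential unit is replaced by a copy of a random unit
                thetas[i], cs[i] = thetas[j].copy(), cs[j]
            else:
                # "birth": a low-potential unit is duplicated over a randomly chosen slot
                thetas[j], cs[j] = thetas[i].copy(), cs[i]
    return thetas, cs
```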
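
As a usage illustration of the step above, the following loop interleaves it with mini-batch SGD on a student-teacher ReLU problem of the kind the report flags as under-specified. Every hyperparameter (input dimension, teacher and student widths, batch size, learning rate, birth-death time step, iteration count) is a placeholder assumption, not a value reported in the paper, and `birth_death_step` is the hypothetical function from the previous sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical student-teacher ReLU setup; all sizes are assumptions.
d, n_teacher, n_student, batch = 10, 5, 100, 128
teacher_w = rng.standard_normal((n_teacher, d))
teacher_c = rng.standard_normal(n_teacher)

def teacher(X):
    # Target function generated by a fixed random ReLU network.
    return np.maximum(X @ teacher_w.T, 0.0) @ teacher_c / n_teacher

thetas = rng.standard_normal((n_student, d))
cs = rng.standard_normal(n_student)

lr, dt, steps = 0.05, 0.01, 2000                     # assumed hyperparameters
for _ in range(steps):
    X = rng.standard_normal((batch, d))
    Y = teacher(X)
    acts = np.maximum(X @ thetas.T, 0.0)             # (batch, n_student)
    resid = acts @ cs / n_student - Y                # (batch,)
    # Mini-batch SGD on both layers (the mean-field 1/n factor is absorbed into lr).
    grad_c = (resid[:, None] * acts).mean(axis=0)
    mask = (acts > 0.0).astype(float)
    grad_t = (resid[:, None, None] * mask[:, :, None] * X[:, None, :]).mean(axis=0) * cs[:, None]
    cs -= lr * grad_c
    thetas -= lr * grad_t
    # Interleave the birth-death resampling step sketched above.
    thetas, cs = birth_death_step(thetas, cs, X, Y, dt, rng)
```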