On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)

Authors: Zhiyuan Li, Sadhika Malladi, Sanjeev Arora

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical testing showing that the trajectory under SVAG converges and closely follows SGD, suggesting (in combination with the previous result) that the SDE approximation can be a meaningful approach to understanding the implicit bias of SGD in deep learning. We train PreResNet32 with BN on CIFAR-10 for 300 epochs, decaying by 0.1 at epoch 250.
Researcher Affiliation | Academia | Zhiyuan Li, Sadhika Malladi, Sanjeev Arora; Princeton University; {zhiyuanli,smalladi,arora}@cs.princeton.edu
Pseudocode | No | The paper describes the SVAG algorithm using mathematical equations and descriptive text, but it does not provide a clearly labeled pseudocode or algorithm block. (A hedged sketch of one SVAG step follows the table.)
Open Source Code | Yes | We provide our code at https://github.com/sadhikamalladi/svag.
Open Datasets | Yes | SGD with batch size 125 and NGD with matching covariance have close train and test curves when training on CIFAR-10. We train PreResNet32 with BN on CIFAR-10 for 300 epochs, decaying by 0.1 at epoch 250. (A torchvision sketch loading the standard CIFAR-10 splits follows the table.)
Dataset Splits | No | The paper mentions 'train' and 'test' curves/accuracy but does not specify a validation split or any split methodology.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models, CPU types, memory); it discusses the experimental setup only at a higher level.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | All three settings use the same LR schedule: LR = 0.8 initially, decayed by 0.1 at epoch 250, with a total budget of 300 epochs. SGD with batch size 125 and NGD with matching covariance have close train and test curves when training on CIFAR-10. (A PyTorch sketch of this schedule follows the table.)
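
Since the paper gives no labeled pseudocode for SVAG (see the Pseudocode row), the following is a minimal sketch of one SVAG step reconstructed from the paper's update rule: gradients from two independent mini-batches are combined with coefficients (1 ± sqrt(2l − 1))/2, which keeps the gradient mean unchanged, scales the gradient-noise covariance by l, and pairs with a step size of lr / l. The function name `svag_step`, its argument layout, and the PyTorch style are illustrative assumptions, not the authors' implementation (their actual code is in the repository linked above).

```python
# Minimal sketch of one SVAG (Stochastic Variance Amplified Gradient) step.
# Assumption: the function name, arguments, and PyTorch style are illustrative;
# only the combination rule itself follows the paper's description.
import math
import torch

def svag_step(model, loss_fn, batch_a, batch_b, lr, l):
    """Apply one SVAG update with noise-amplification level l >= 1.

    The two coefficients sum to 1 (gradient mean unchanged) and their
    squares sum to l (gradient-noise covariance scaled by l), while the
    step size shrinks to lr / l; as l grows, the iterates approach the
    limiting SDE trajectory.
    """
    a = (1 + math.sqrt(2 * l - 1)) / 2
    b = (1 - math.sqrt(2 * l - 1)) / 2

    # Gradient on the first independent mini-batch.
    inputs_a, targets_a = batch_a
    model.zero_grad()
    loss_fn(model(inputs_a), targets_a).backward()
    grads_a = [p.grad.detach().clone() for p in model.parameters()]

    # Gradient on the second independent mini-batch.
    inputs_b, targets_b = batch_b
    model.zero_grad()
    loss_fn(model(inputs_b), targets_b).backward()
    grads_b = [p.grad.detach().clone() for p in model.parameters()]

    # SVAG update: x <- x - (lr / l) * (a * g_a + b * g_b).
    with torch.no_grad():
        for p, ga, gb in zip(model.parameters(), grads_a, grads_b):
            p.add_(a * ga + b * gb, alpha=-lr / l)
```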
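
The Open Datasets row refers to CIFAR-10, which ships with fixed train and test splits (consistent with the Dataset Splits row noting that no validation split is specified). A minimal torchvision loading sketch at the reported batch size of 125 is below; the ToTensor-only transform is a placeholder assumption, since the paper's augmentation choices are not quoted here.

```python
# Loading the standard CIFAR-10 train/test splits at batch size 125.
# Assumption: the transform is a placeholder, not the authors' pipeline.
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.ToTensor()  # placeholder; augmentation not specified here

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_set, batch_size=125, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=125, shuffle=False)
```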
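
The schedule in the Experiment Setup row maps directly onto standard PyTorch utilities. Below is a minimal sketch of that schedule (LR 0.8, multiplied by 0.1 at epoch 250, 300 epochs total); the tiny placeholder model and the omitted training loop are assumptions, as the paper's PreResNet32 lives in the authors' repository.

```python
# Reported schedule: LR = 0.8, multiplied by 0.1 at epoch 250, 300 epochs total.
# Assumption: the tiny linear model stands in for PreResNet32 with BN on CIFAR-10.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.8)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[250], gamma=0.1)

for epoch in range(300):
    # ... one epoch of training on CIFAR-10 (batch size 125) goes here ...
    optimizer.step()   # stand-in so the scheduler follows an optimizer step
    scheduler.step()   # multiplies the LR by 0.1 once epoch 250 is reached
```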