SGD Can Converge to Local Maxima

Authors: Liu Ziyin, Botao Li, James B. Simon, Masahito Ueda

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We also realize results in a minimal neural network-like example. In Sec. 6, we present the numerical simulations, including a minimal example involving a neural network.
Researcher Affiliation | Academia | (1) The University of Tokyo; (2) ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris; (3) University of California, Berkeley
Pseudocode | No | The paper defines its algorithms (e.g., SGD and AMSGrad) through mathematical update equations rather than pseudocode (a minimal sketch of these update rules follows the table).
Open Source Code | No | The paper does not provide any statement about releasing source code or links to a code repository.
Open Datasets | No | The paper uses only synthetic toy examples and a minimal neural network-like example; it does not use publicly available datasets.
Dataset Splits | No | The paper does not specify explicit training/validation/test dataset splits. For the toy neural network example, it mentions only how the weights are initialized, not any data split.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments (e.g., specific GPU/CPU models, memory, or cloud computing instances).
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | In this numerical example, we set λ = 0.8 and a = −1... we set λ = 0.2 and β2 = 0.999 for both Adam and AMSGrad. When momentum is used, we set β1 = 0.9. GD is run with a learning rate of 0.01. ...w1 is initialized uniformly in [−1,1]; w2 is initialized uniformly in [0,1]... at a small learning rate (λ = 0.001)... when the learning rate is large (λ = 0.1). (A hedged sketch of this configuration follows the table.)
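
Since the paper states SGD and AMSGrad only as update equations, the following is a minimal NumPy sketch of the standard SGD-with-momentum and AMSGrad update rules, using the step sizes quoted above (0.01 for GD, λ = 0.2 with β1 = 0.9 and β2 = 0.999 for AMSGrad). The gradient argument and the quadratic loss in the usage example are placeholders, not the paper's objective.

```python
# Minimal sketch of the standard SGD-with-momentum and AMSGrad updates,
# written from the textbook definitions; this is not the paper's code.
import numpy as np

def sgd_momentum_step(w, m, grad, lr=0.01, beta1=0.9):
    """One SGD-with-momentum step: m <- beta1 * m + g, then w <- w - lr * m."""
    m = beta1 * m + grad
    w = w - lr * m
    return w, m

def amsgrad_step(w, m, v, v_hat, grad, lr=0.2, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad step: Adam-style moment estimates, but with a running max of v."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    v_hat = np.maximum(v_hat, v)           # the AMSGrad modification to Adam
    w = w - lr * m / (np.sqrt(v_hat) + eps)
    return w, m, v, v_hat

# Placeholder usage on a quadratic loss ||w||^2 (not the paper's objective).
w = np.array([0.5, -0.3])
m = v = v_hat = np.zeros_like(w)
for _ in range(100):
    grad = 2.0 * w
    w, m, v, v_hat = amsgrad_step(w, m, v, v_hat, grad)
print("w after 100 AMSGrad steps:", w)
```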
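
For concreteness, here is a hedged reconstruction of the quoted experiment configuration (learning rates, β coefficients, the toy-example constant a, and the uniform initialization of w1 and w2). The loss function and training loop below are stand-in placeholders for illustration only; the paper's actual toy objective and protocol are not reproduced here.

```python
# Hypothetical reconstruction of the reported experiment configuration.
# The loss is a placeholder surrogate, NOT the paper's toy objective.
import numpy as np

rng = np.random.default_rng(0)

# Initialization as quoted: w1 ~ Uniform[-1, 1], w2 ~ Uniform[0, 1].
w1 = rng.uniform(-1.0, 1.0)
w2 = rng.uniform(0.0, 1.0)

# Hyperparameters as quoted in the Experiment Setup row.
config = {
    "lambda_example": 0.8,   # λ quoted for the first numerical example
    "a": -1.0,               # toy-example constant quoted alongside λ = 0.8
    "adam_lr": 0.2,          # λ used for Adam / AMSGrad
    "beta1": 0.9,            # momentum coefficient when momentum is used
    "beta2": 0.999,          # second-moment coefficient for Adam / AMSGrad
    "gd_lr": 0.01,           # GD learning rate
    "small_lr": 0.001,       # "small learning rate" regime
    "large_lr": 0.1,         # "large learning rate" regime
}

def placeholder_loss_grad(w1, w2, x):
    """Per-sample gradient of a placeholder loss 0.5 * (w2 * w1 * x)^2."""
    y = w2 * w1 * x
    return y * w2 * x, y * w1 * x   # dL/dw1, dL/dw2

# Illustrative (stochastic) gradient descent with the quoted GD learning rate.
for step in range(1000):
    x = rng.normal()                     # synthetic input sample
    g1, g2 = placeholder_loss_grad(w1, w2, x)
    w1 -= config["gd_lr"] * g1
    w2 -= config["gd_lr"] * g2

print(f"final parameters: w1={w1:.4f}, w2={w2:.4f}")
```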