Bad Global Minima Exist and SGD Can Reach Them

Authors: Shengchao Liu, Dimitris Papailiopoulos, Dimitris Achlioptas

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We ran experiments on the CIFAR [21] dataset (including CIFAR10 and CIFAR100), CINIC10 [22] and a resized Restricted ImageNet [23]. In Section 3, we show that the phenomenon sketched above persists in state-of-the-art neural network architectures over real datasets. Specifically, we examine VGG16, ResNet18, ResNet50, and DenseNet40, trained on CIFAR, CINIC10, and a restricted version of ImageNet.
Researcher Affiliation | Academia | Shengchao Liu, Quebec Artificial Intelligence Institute (Mila), Université de Montréal, liusheng@mila.quebec; Dimitris Papailiopoulos, University of Wisconsin-Madison, dimitris@papail.io; Dimitris Achlioptas, University of Athens, optas@di.uoa.gr
Pseudocode | Yes | Algorithm 1: Adversarial initialization (a hedged code sketch of this idea follows the table).
Open Source Code | Yes | Our figures, models, and all results can be reproduced using the code available at an anonymous GitHub repository: https://github.com/chao1224/BadGlobalMinima.
Open Datasets | Yes | We ran experiments on the CIFAR [21] dataset (including CIFAR10 and CIFAR100), CINIC10 [22] and a resized Restricted ImageNet [23].
Dataset Splits | No | The paper states the size of the training and test sets for each dataset (e.g., 'The CIFAR training set consists of 50k data points and the test set consists of 10k data points'), but it does not describe how a validation set was created, its size, or any validation split methodology.
Hardware Specification | No | The paper does not specify the hardware used for experiments, such as exact GPU/CPU models, memory, or specific cloud computing instances with their specifications.
Software Dependencies | Yes | We run our experiments on PyTorch 0.3.
Experiment Setup | Yes | Hyperparameters: We apply well-tuned hyperparameters for each model and dataset. For CIFAR, CINIC10, and Restricted ImageNet, we use batch size 128, while the momentum term is set to 0.9 when it is used. When we use ℓ2 regularization, the regularization parameter is 5 × 10⁻⁴ for CIFAR and Restricted ImageNet and 10⁻⁴ for CINIC10. We use the following learning rate schedule for CIFAR: 0.1 for epochs 1 to 150, 0.01 for epochs 151 to 250, and 0.001 for epochs 251 to 350. We use the following learning rate schedule for CINIC10 and Restricted ImageNet: 0.1 for epochs 1 to 150, 0.01 for epochs 151 to 225, and 0.001 for epochs 226 to 300. (A configuration sketch for the CIFAR settings follows the table.)
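
The Pseudocode row refers to the paper's Algorithm 1 (adversarial initialization). The sketch below is a minimal reading of that idea, not a reproduction of the authors' exact procedure: pre-train the network to memorize a randomly relabeled copy of the training set, without augmentation or regularization, and reuse the resulting weights as the starting point for standard training on the true labels. The function name, epoch count, learning rate, and the CIFAR10/ResNet18 usage example are illustrative assumptions.

```python
# Hedged sketch of adversarial initialization; details are assumptions, see comments.
import copy
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader


def adversarial_init(model, train_set, num_classes=10, epochs=100,
                     device="cuda" if torch.cuda.is_available() else "cpu"):
    """Pre-train `model` to memorize randomly relabeled data and return its weights.

    Follows the high-level idea of Algorithm 1 (fit random labels, then use the
    result as an initialization); the paper's exact recipe may differ.
    Assumes a torchvision-style dataset exposing a `.targets` list.
    """
    # Replace every label with an independent uniformly random label (assumption).
    poisoned = copy.deepcopy(train_set)
    poisoned.targets = torch.randint(0, num_classes, (len(poisoned),)).tolist()

    loader = DataLoader(poisoned, batch_size=128, shuffle=True, num_workers=2)
    # Plain SGD with no momentum, no weight decay, and no augmentation:
    # nothing that would regularize the memorization phase.
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    # These weights serve as the "adversarial" initialization for training on true labels.
    return copy.deepcopy(model.state_dict())


# Example usage (CIFAR10 and a stock ResNet18 assumed purely for illustration):
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True,
                                         transform=T.ToTensor())
model = torchvision.models.resnet18(num_classes=10)
bad_init = adversarial_init(model, train_set)
```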
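
The Experiment Setup row maps directly onto a standard SGD plus step-schedule configuration. The sketch below expresses the reported CIFAR settings (batch size 128 on the DataLoader, momentum 0.9, weight decay 5 × 10⁻⁴, learning rate 0.1 dropped to 0.01 after epoch 150 and to 0.001 after epoch 250) in current PyTorch; the paper used PyTorch 0.3, so the API there may differ slightly, and the surrounding training loop is assumed.

```python
import torch


def make_optimizer_and_schedule(model):
    # CIFAR settings reported in the paper: SGD with momentum 0.9 and
    # weight decay 5e-4 when l2 regularization is used.
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4
    )
    # Learning-rate schedule: 0.1 for epochs 1-150, 0.01 for 151-250,
    # 0.001 for 251-350 (milestones count completed epochs).
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[150, 250], gamma=0.1
    )
    return optimizer, scheduler


# Per-epoch usage (training loop itself is assumed, not from the paper):
# optimizer, scheduler = make_optimizer_and_schedule(model)
# for epoch in range(350):
#     train_one_epoch(model, loader, optimizer)
#     scheduler.step()
```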