Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models

Authors: Leonardo Galli, Holger Rauhut, Mark Schmidt

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our experiments show that nonmonotone methods improve the speed of convergence and generalization properties of SGD/Adam even beyond the previous monotone line searches." |
| Researcher Affiliation | Academia | Leonardo Galli, Holger Rauhut (RWTH Aachen University, Aachen; {galli, rauhut}@mathc.rwth-aachen.de); Mark Schmidt (University of British Columbia, Canada; CIFAR AI Chair, Amii; schmidtm@cs.ubc.ca) |
| Pseudocode | No | The paper describes the proposed methods and equations in the main text, but it does not include a distinct pseudocode block or algorithm listing. |
| Open Source Code | No | The paper makes no explicit statement about releasing its source code and provides no link to a code repository. |
| Open Datasets | Yes | "In particular, we focus on the datasets MNIST, Fashion MNIST, CIFAR10, CIFAR100 and SVHN, addressed with the architectures MLP [Luo et al., 2019], EfficientNet-b1 [Tan and Le, 2019], ResNet-34 [He et al., 2016], DenseNet-121 [Huang et al., 2017] and WideResNet [Zagoruyko and Komodakis, 2016]." |
| Dataset Splits | No | The paper mentions "train loss" and "test accuracy" on standard datasets, but it does not state how each dataset was split into training, validation, and test sets (e.g., percentages, specific split files, or citations to standard split methodologies). |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., CPU or GPU models, memory, or cluster configuration). |
| Software Dependencies | No | The paper does not give version numbers for the software libraries used in the experiments (e.g., "Python 3.8, PyTorch 1.9"). |
| Experiment Setup | No | The paper states that "the learning rate of SGD and Adam has been chosen through a grid-search" and defers implementation details and hyperparameter sensitivity to supplementary Section C, but the main text does not list the specific hyperparameter values or training configurations. |
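Since the paper provides no pseudocode, the following is a minimal sketch of a *generic* nonmonotone backtracking line search of the Grippo-Lampariello-Lucidi (max-type) family, which the paper's relaxation builds on. It is not the authors' exact stochastic method; the function names, the window size `M = 5`, and the toy quadratic objective are all illustrative choices.

```python
import numpy as np

def nonmonotone_backtracking(f, grad, x, d, history, c=1e-4, beta=0.5, max_iter=50):
    """Backtracking line search with a nonmonotone (max-type) Armijo condition.

    Instead of requiring descent relative to f(x), a step is accepted if it
    descends relative to the maximum of the last few objective values
    (`history`), so individual steps may temporarily increase the loss.
    All names and defaults here are illustrative, not the paper's.
    """
    f_ref = max(history)        # nonmonotone reference value
    slope = grad @ d            # directional derivative (negative for descent)
    alpha = 1.0
    for _ in range(max_iter):
        if f(x + alpha * d) <= f_ref + c * alpha * slope:
            return alpha
        alpha *= beta           # shrink the step and retry
    return alpha

# Toy usage: steepest descent on f(x) = ||x||^2 with a window of M = 5 values.
f = lambda x: float(x @ x)
x = np.array([2.0, -1.0])
history = [f(x)]
for _ in range(10):
    g = 2 * x                   # gradient of ||x||^2
    alpha = nonmonotone_backtracking(f, g, x, -g, history[-5:])
    x = x + alpha * (-g)
    history.append(f(x))

print(history[-1] < history[0])  # → True
```

With `M = 1` (a window containing only the current value), the acceptance test reduces to the ordinary monotone Armijo condition; enlarging the window is what relaxes monotonicity.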