BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization

Authors: Chen Fan, Gaspard Choné-Ducasse, Mark Schmidt, Christos Thrampoulidis

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, our extensive experiments demonstrate that the new algorithms, which are available in both SGD and Adam versions, can find large learning rates with minimal tuning and converge faster than corresponding vanilla SGD or Adam BO algorithms that require fine-tuning.
Researcher Affiliation | Academia | 1 University of British Columbia, 2 Ecole Normale Supérieure, 3 Canada CIFAR AI Chair (Amii)
Pseudocode | Yes | Algorithm 1 (BiSLS-Adam/SGD) and Algorithm 2 (reset). A hedged sketch of the underlying SLS/SPS step-size rules is given after this table.
Open Source Code | No | The paper does not contain an explicit statement or link indicating the release of open-source code for the described methodology.
Open Datasets | Yes | The experiments are performed on MNIST dataset using LeNet [26, 42]. ... Binary linear classification on w8a dataset using logistic loss [3].
Dataset Splits | Yes | Validation loss against upper-level iterations for different values of β (left, α = 0.005) and α (right, β = 0.01). ... where (X1, Y1) and (X2, Y2) are validation and training data sets with sizes DX1 and DX2, respectively. A generic statement of the corresponding bi-level objective is sketched after this table.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU or CPU models; it refers to computation only in general terms.
Software Dependencies | No | The paper mentions optimization methods common in deep learning, such as Adam and SGD variants, but does not specify versions for any software components.
Experiment Setup | Yes | For constant-step SGD and Adam, we tune the lower-level learning rate β ∈ {10.0, 5.0, 1.0, 0.5, 0.1, 0.05, 0.01}. For the upper-level learning rate, we tune α ∈ {0.001, 0.0025, 0.005, 0.01, 0.05, 0.1} for SGD, and α ∈ {10⁻⁵, 5×10⁻⁵, 10⁻⁴, 5×10⁻⁴, 0.001, 0.01} for Adam. A sketch of this grid search follows the table.
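The paper's Algorithm 1 and Algorithm 2 are not reproduced in this summary. As a point of reference, the sketch below shows the standard single-level SPS (stochastic Polyak step size) and SLS (stochastic Armijo line search) rules that the BiSLS/BiSPS methods adapt to the bi-level setting. The function names, default constants (c, beta, gamma_max, gamma_init), and the use of plain NumPy are illustrative assumptions, not the paper's exact pseudocode or hyperparameters.

```python
import numpy as np

def sps_step_size(loss_val, grad, c=0.5, f_star=0.0, gamma_max=10.0):
    """Stochastic Polyak step size on a sampled batch:
    gamma = (f_i(w) - f_i*) / (c * ||grad f_i(w)||^2), capped at gamma_max.
    Standard single-level form; the paper applies analogous rules per level."""
    denom = c * float(np.dot(grad, grad)) + 1e-12  # guard against a zero gradient
    return min((loss_val - f_star) / denom, gamma_max)

def sls_step_size(w, loss_fn, grad, gamma_init=10.0, c=0.1, beta=0.9, max_backtracks=50):
    """Stochastic line search on a sampled batch: start from a large step size and
    backtrack until the Armijo condition
    f_i(w - gamma * g) <= f_i(w) - c * gamma * ||g||^2 holds."""
    gamma = gamma_init
    f_w = loss_fn(w)
    g_norm_sq = float(np.dot(grad, grad))
    for _ in range(max_backtracks):
        if loss_fn(w - gamma * grad) <= f_w - c * gamma * g_norm_sq:
            break
        gamma *= beta  # shrink the step size and retry
    return gamma
```

Both rules return a step size computed from the current batch, which is what allows the bi-level algorithms to start from large learning rates without manual tuning.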
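The Dataset Splits row quotes validation data (X1, Y1) and training data (X2, Y2) with sizes DX1 and DX2. A generic bi-level objective consistent with that split is written out below; the symbols F, ℓ, g, λ, and w are placeholders and the paper's exact notation may differ.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Upper level: validation loss on (X_1, Y_1); lower level: training loss on (X_2, Y_2).
\begin{align*}
\min_{\lambda}\ & F(\lambda) = \frac{1}{D_{X_1}} \sum_{(x,y) \in (X_1, Y_1)} \ell\big(w^*(\lambda);\, x, y\big) \\
\text{s.t.}\ & w^*(\lambda) \in \operatorname*{arg\,min}_{w}\ \frac{1}{D_{X_2}} \sum_{(x,y) \in (X_2, Y_2)} g\big(w, \lambda;\, x, y\big)
\end{align*}
\end{document}
```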
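The grids below are taken directly from the Experiment Setup row; the tuning loop itself is a minimal sketch, and `run_bilevel_sgd` is a hypothetical callable standing in for one full bi-level training run that returns a final validation loss.

```python
from itertools import product

# Learning-rate grids quoted in the Experiment Setup row for the constant-step baselines.
betas = [10.0, 5.0, 1.0, 0.5, 0.1, 0.05, 0.01]        # lower-level learning rates
alphas_sgd = [0.001, 0.0025, 0.005, 0.01, 0.05, 0.1]   # upper-level learning rates (SGD)
alphas_adam = [1e-5, 5e-5, 1e-4, 5e-4, 0.001, 0.01]    # upper-level learning rates (Adam)

def tune_constant_step(run_bilevel_sgd):
    """Exhaustive grid search over (alpha, beta); returns the pair with the
    lowest final validation loss reported by the hypothetical training callable."""
    return min(product(alphas_sgd, betas),
               key=lambda ab: run_bilevel_sgd(alpha=ab[0], beta=ab[1]))
```

This 6 x 7 grid (42 runs per optimizer) is the tuning cost that the BiSLS/BiSPS variants aim to avoid by finding step sizes automatically.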