An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness

Authors: Xiaochuan Gong, Jie Hao, Mingrui Liu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on various tasks confirm that our proposed algorithm achieves the predicted theoretical acceleration and significantly outperforms baselines in bilevel optimization. The code is available here.
Researcher Affiliation | Academia | Xiaochuan Gong, Jie Hao, Mingrui Liu; Department of Computer Science, George Mason University; {xgong2, jhao6, mingruil}@gmu.edu
Pseudocode | Yes | Algorithm 1 STOCHASTIC NESTEROV ACCELERATED GRADIENT METHOD (SNAG)... Algorithm 2 ACCELERATED BILEVEL OPTIMIZATION ALGORITHM (ACCBO) (a hedged, generic SNAG sketch is given after this table)
Open Source Code | Yes | The code is available here.
Open Datasets | Yes | The Deep AUC maximization experiment is performed on imbalanced Sentiment140 [31] dataset... We perform bilevel optimization algorithms on the noisy text classification dataset Stanford Natural Language Inference (SNLI) [8]
Dataset Splits | No | The paper mentions a 'training set' and 'test set' for Sentiment140 and refers to a 'clean validation set Dval' in the data hypercleaning formulation, but it does not give explicit percentages or sample counts for the train/validation/test splits used in the experiments.
Hardware Specification | Yes | All the experiments are run on an NVIDIA A6000 GPU (48 GB memory) and an AMD EPYC 7513 32-core CPU.
Software Dependencies | No | The paper names the datasets and neural networks used but does not provide version numbers for software dependencies such as the programming language, libraries, or frameworks used for implementation.
Experiment Setup | Yes | Hyperparameter setting. We tune the best hyperparameters for each algorithm, including the upper/lower-level step sizes, the number of inner loops, momentum parameters, etc. The upper-level learning rate ηup and lower-level learning rate ηlow are tuned in the range [0.001, 0.1]... The batch size is set to 32... The momentum parameter β is fixed to 0.9... The warm-start steps for the lower-level variable in AccBO are set to 3. (A configuration sketch reflecting these values appears below.)
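
The excerpt above names Algorithm 1 (SNAG) but does not reproduce its update rule. For reference, here is a minimal NumPy sketch of a generic stochastic Nesterov accelerated gradient step, assuming the standard look-ahead formulation; the names `snag_step` and `stoch_grad` are illustrative only, and the paper's Algorithm 1 (and its use inside AccBO's lower-level updates) may differ in details.

```python
import numpy as np

def snag_step(x, v, stoch_grad, lr=0.01, beta=0.9):
    """One generic stochastic Nesterov accelerated gradient (SNAG) step.

    stoch_grad: callable returning a stochastic gradient estimate at a point.
    The gradient is evaluated at the look-ahead point x + beta * v, then the
    momentum buffer v and the iterate x are updated.
    """
    g = stoch_grad(x + beta * v)   # stochastic gradient at the look-ahead point
    v = beta * v - lr * g          # momentum (velocity) update
    x = x + v                      # iterate update
    return x, v

# Toy usage: minimize f(x) = 0.5 * ||x||^2 with noisy gradients.
rng = np.random.default_rng(0)
x, v = np.ones(5), np.zeros(5)
for _ in range(200):
    x, v = snag_step(x, v, lambda z: z + 0.01 * rng.standard_normal(z.shape))
print(np.linalg.norm(x))  # the norm shrinks toward the noise floor
```

The look-ahead gradient evaluation at x + beta * v is what distinguishes Nesterov momentum from the heavy-ball update, which evaluates the gradient at x itself.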
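
The hyperparameter excerpt gives tuning ranges and fixed values. The snippet below simply collects them into a hypothetical configuration sampler for a reproduction attempt; the dictionary names, the discrete grid points inside the stated [0.001, 0.1] range, and the use of random search are all assumptions, since the paper's exact tuning procedure is not quoted.

```python
import random

# Hypothetical search space reflecting the reported tuning ranges.
SEARCH_SPACE = {
    "eta_upper": [0.001, 0.003, 0.01, 0.03, 0.1],  # upper-level step size, tuned in [0.001, 0.1]
    "eta_lower": [0.001, 0.003, 0.01, 0.03, 0.1],  # lower-level step size, tuned in [0.001, 0.1]
}

# Values reported as fixed in the excerpt.
FIXED = {
    "batch_size": 32,        # reported batch size
    "beta": 0.9,             # momentum parameter, fixed at 0.9
    "warm_start_steps": 3,   # lower-level warm-start steps for AccBO
}

def sample_config(rng=None):
    """Draw one hyperparameter configuration from the assumed search space."""
    rng = rng or random.Random(0)
    cfg = {name: rng.choice(grid) for name, grid in SEARCH_SPACE.items()}
    cfg.update(FIXED)
    return cfg

print(sample_config())
```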