An Accelerated Algorithm for Stochastic Bilevel Optimization under Unbounded Smoothness
Authors: Xiaochuan Gong, Jie Hao, Mingrui Liu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on various tasks confirm that our proposed algorithm achieves the predicted theoretical acceleration and significantly outperforms baselines in bilevel optimization. The code is available here. |
| Researcher Affiliation | Academia | Xiaochuan Gong Jie Hao Mingrui Liu Department of Computer Science George Mason University {xgong2, jhao6, mingruil}@gmu.edu |
| Pseudocode | Yes | Algorithm 1 STOCHASTIC NESTEROV ACCELERATED GRADIENT METHOD (SNAG)... Algorithm 2 ACCELERATED BILEVEL OPTIMIZATION ALGORITHM (ACCBO) |
| Open Source Code | Yes | The code is available here. |
| Open Datasets | Yes | The Deep AUC maximization experiment is performed on imbalanced Sentiment140 [31] dataset... We perform bilevel optimization algorithms on the noisy text classification dataset Stanford Natural Language Inference (SNLI) [8] |
| Dataset Splits | No | The paper mentions 'training set' and 'test set' for Sentiment140, and refers to a 'clean validation set Dval' in the problem formulation for Data Hypercleaning, but does not specify explicit percentages or counts for dataset splits (train/validation/test) used in the experiments. |
| Hardware Specification | Yes | All experiments are run on an NVIDIA A6000 (48GB memory) GPU and an AMD EPYC 7513 32-Core CPU. |
| Software Dependencies | No | The paper mentions the datasets and neural networks used, but does not provide version numbers for any software dependencies such as the programming language, libraries, or frameworks used in the implementation. |
| Experiment Setup | Yes | Hyperparameter setting. We tune the best hyperparameters for each algorithm, including upper/lower-level step size, the number of inner loops, momentum parameters, etc. The upper-level learning rate ηup and lower-level learning rate ηlow are tuned in the range of [0.001, 0.1]... The batch size is set to be 32... The momentum parameter β is fixed to 0.9... The warm start steps for the lower-level variable in AccBO is set to 3. |
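
For context on the reported settings (momentum β = 0.9, step sizes tuned in [0.001, 0.1]), the following is a minimal sketch of a stochastic Nesterov-accelerated gradient step of the kind named in Algorithm 1 (SNAG). It is not the authors' implementation: the function names (`snag_step`, `grad_fn`) and the toy quadratic objective are illustrative assumptions, shown only to make the role of the tuned hyperparameters concrete.

```python
import numpy as np

def snag_step(x, x_prev, grad_fn, eta=0.01, beta=0.9):
    """One Nesterov-style accelerated step on the iterate x.

    grad_fn: stochastic gradient oracle, evaluated at the extrapolated point.
    eta:     step size (the paper tunes step sizes in [0.001, 0.1]).
    beta:    momentum parameter (fixed to 0.9 in the paper's experiments).
    """
    y = x + beta * (x - x_prev)      # extrapolation (look-ahead) point
    x_next = y - eta * grad_fn(y)    # stochastic gradient step at the look-ahead point
    return x_next, x                 # return new iterate and previous iterate

# Toy usage: minimize a quadratic with a noisy gradient oracle (illustrative only).
rng = np.random.default_rng(0)
A = np.diag([1.0, 10.0])
grad_fn = lambda y: A @ y + 0.01 * rng.standard_normal(2)  # noisy gradient of 0.5*y'Ay

x, x_prev = np.ones(2), np.ones(2)
for _ in range(200):
    x, x_prev = snag_step(x, x_prev, grad_fn, eta=0.05, beta=0.9)
print(x)  # close to the minimizer at the origin, up to gradient noise
```

In the paper's AccBO setting this kind of accelerated update is applied within a bilevel loop (upper- and lower-level variables with separate step sizes and a warm start for the lower-level variable); the sketch above only illustrates the single-level accelerated step.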