Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction

Authors: Wei Jiang, Sifan Yang, Wenhao Yang, Lijun Zhang

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments across different tasks validate the effectiveness of our proposed methods. In this section, we assess the performance of the proposed methods through numerical experiments. Concretely, we train a ResNet18 model [He et al., 2016] on the CIFAR-10 dataset [Krizhevsky, 2009].
Researcher Affiliation | Academia | Wei Jiang (1), Sifan Yang (1,2), Wenhao Yang (1,2), Lijun Zhang (1,3,2); (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (2) School of Artificial Intelligence, Nanjing University, Nanjing, China; (3) Pazhou Laboratory (Huangpu), Guangzhou, China
Pseudocode | Yes | Algorithm 1 SSVR (an illustrative sketch of a sign-based, variance-reduced update follows the table)
Open Source Code | No | Due to privacy concerns and ongoing research, we do not include the code.
Open Datasets | Yes | Concretely, we train a ResNet18 model [He et al., 2016] on the CIFAR-10 dataset [Krizhevsky, 2009].
Dataset Splits | No | For hyper-parameter tuning, we either follow the recommendations from the original papers or employ a grid search to determine the best settings. The paper does not specify the train/validation/test splits explicitly.
Hardware Specification | Yes | All experiments are conducted on NVIDIA 3090 GPUs.
Software Dependencies | No | The paper does not specify software versions for libraries or frameworks used in the experiments.
Experiment Setup | Yes | For hyper-parameter tuning, we either follow the recommendations from the original papers or employ a grid search to determine the best settings. Specifically, the momentum parameter β is searched from the set {0.1, 0.5, 0.9, 0.99}, and the learning rate is fine-tuned within the range of {1e-5, 1e-4, 1e-3, 1e-2, 1e-1}. (An illustrative grid-search sketch also follows the table.)
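
The pseudocode row above references Algorithm 1 (SSVR). As a rough illustration only, the sketch below shows a sign-based parameter update driven by a variance-reduced (STORM-style) gradient estimate; the function name ssvr_step, the parameter names beta and lr, and the overall structure are assumptions made for illustration and are not taken from the paper's Algorithm 1.

    import torch

    def ssvr_step(params, grads, prev_grads, estimates, beta, lr):
        # Variance-reduced gradient estimate (STORM-style recursion),
        # followed by a sign-based parameter update:
        #   d_t     = g(x_t) + (1 - beta) * (d_{t-1} - g(x_{t-1}))
        #   x_{t+1} = x_t - lr * sign(d_t)
        new_estimates = []
        for p, g, g_prev, d_prev in zip(params, grads, prev_grads, estimates):
            d = g + (1.0 - beta) * (d_prev - g_prev)
            p.data.add_(-lr * torch.sign(d))
            new_estimates.append(d)
        return new_estimates

In this sketch, grads and prev_grads are meant to be gradients computed on the same minibatch at the current and previous iterates, which is what makes the recursion a variance-reduced estimator.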
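
The experiment-setup row quotes a grid search over the momentum parameter β and the learning rate. Below is a minimal, illustrative sketch of that search; train_and_evaluate is a hypothetical placeholder standing in for one full training run (e.g. ResNet18 on CIFAR-10) that returns a validation metric, and it does not come from the paper.

    import itertools

    def train_and_evaluate(beta, lr):
        # Hypothetical placeholder: a real implementation would train the
        # model with these hyper-parameters and return validation accuracy.
        return 0.0

    betas = [0.1, 0.5, 0.9, 0.99]                    # candidate momentum parameters
    learning_rates = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]  # candidate learning rates

    best_config, best_score = None, float("-inf")
    for beta, lr in itertools.product(betas, learning_rates):
        score = train_and_evaluate(beta=beta, lr=lr)
        if score > best_score:
            best_config, best_score = (beta, lr), score
    print("Best (beta, lr):", best_config)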