Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction
Authors: Wei Jiang, Sifan Yang, Wenhao Yang, Lijun Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments across different tasks validate the effectiveness of our proposed methods. In this section, we assess the performance of the proposed methods through numerical experiments. Concretely, we train a ResNet18 model [He et al., 2016] on the CIFAR-10 dataset [Krizhevsky, 2009]. |
| Researcher Affiliation | Academia | Wei Jiang¹, Sifan Yang¹,², Wenhao Yang¹,², Lijun Zhang¹,³,² — ¹National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; ²School of Artificial Intelligence, Nanjing University, Nanjing, China; ³Pazhou Laboratory (Huangpu), Guangzhou, China |
| Pseudocode | Yes | Algorithm 1 SSVR |
| Open Source Code | No | Due to privacy concerns and ongoing research, we do not include the code. |
| Open Datasets | Yes | Concretely, we train a ResNet18 model [He et al., 2016] on the CIFAR-10 dataset [Krizhevsky, 2009]. |
| Dataset Splits | No | For hyper-parameter tuning, we either follow the recommendations from the original papers or employ a grid search to determine the best settings. The paper does not specify the train/validation/test splits explicitly. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA 3090 GPUs. |
| Software Dependencies | No | The paper does not specify software versions for libraries or frameworks used in the experiments. |
| Experiment Setup | Yes | For hyper-parameter tuning, we either follow the recommendations from the original papers or employ a grid search to determine the best settings. Specifically, the momentum parameter β is searched from the set {0.1, 0.5, 0.9, 0.99}, and the learning rate is fine-tuned within the range of {1e-5, 1e-4, 1e-3, 1e-2, 1e-1}. |
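
The table only names Algorithm 1 (SSVR) without reproducing it. Going by the paper's title, SSVR plausibly pairs a sign-based update with a STORM-style variance-reduced gradient estimator: v_t = grad f(x_t; ξ_t) + (1 − β)(v_{t−1} − grad f(x_{t−1}; ξ_t)), followed by x_{t+1} = x_t − η · sign(v_t). The sketch below illustrates that recursion on a toy quadratic; the objective, the `sample` helper, and all constants are assumptions for illustration, not the authors' Algorithm 1.

```python
import torch

# Toy stochastic objective: f(x; xi) = 0.5 * ||x - xi||^2, where xi is a
# noisy observation of the true minimizer `target`. Everything here is an
# illustrative assumption, not the paper's experimental setup.
torch.manual_seed(0)
target = torch.tensor([1.0, -2.0, 3.0])

def sample():
    return target + 0.1 * torch.randn_like(target)

x = torch.zeros(3)
v = x - sample()            # initialize the estimator with one stochastic gradient
lr, beta = 1e-2, 0.9

for t in range(2000):
    x_new = x - lr * torch.sign(v)        # sign-based parameter update
    # STORM-style variance reduction: evaluate the gradient at both the new
    # and the old iterate on the SAME sample xi, then blend with momentum.
    xi = sample()
    g_new = x_new - xi                    # grad of f(.; xi) at x_new
    g_old = x - xi                        # grad of f(.; xi) at x
    v = g_new + (1.0 - beta) * (v - g_old)
    x = x_new

print(x)  # hovers within ~lr of `target` once converged
```

Because the update only uses the sign of the estimator, the iterates oscillate within roughly one step size of the minimizer; the variance-reduced v is what keeps those signs reliable under noisy gradients.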
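The reported hyper-parameter search is a plain grid over β and the learning rate. A minimal sketch of that loop follows; `train_and_eval` is a hypothetical stand-in for the actual ResNet18/CIFAR-10 training run, which the paper does not release.

```python
from itertools import product

def train_and_eval(beta: float, lr: float) -> float:
    """Hypothetical stand-in for training ResNet18 on CIFAR-10 and returning
    a validation score; the real routine is not released with the paper."""
    return -abs(beta - 0.9) - abs(lr - 1e-3)  # dummy score so the sketch runs

# Search space quoted in the table above.
betas = [0.1, 0.5, 0.9, 0.99]
lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]

best_score, best_cfg = max(
    (train_and_eval(beta, lr), (beta, lr)) for beta, lr in product(betas, lrs)
)
print(f"best config: beta={best_cfg[0]}, lr={best_cfg[1]} (score {best_score:.4f})")
```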