Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions
Authors: Wei Jiang, Sifan Yang, Yibo Wang, Lijun Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments across various tasks validate the effectiveness of our method. |
| Researcher Affiliation | Academia | (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (2) School of Artificial Intelligence, Nanjing University, Nanjing, China |
| Pseudocode | Yes | Algorithm 1: STORM Algorithm (see the sketch after this table) |
| Open Source Code | No | Due to privacy concerns and ongoing research, we do not include the code. |
| Open Datasets | Yes | Specifically, we train ResNet18 and ResNet34 models [He et al., 2016] on the CIFAR-10 and CIFAR-100 datasets [Krizhevsky, 2009], respectively. |
| Dataset Splits | No | The paper does not explicitly mention the use of a separate validation dataset split, only training and testing. |
| Hardware Specification | Yes | All the experiments are conducted on eight NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using "Pytorch [Paszke et al., 2019]" but does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For all optimizers, we set the batch size as 256 and train for 200 epochs. [...] The batch size is set as 20 and all methods are trained for 40 epochs with dropout rate 0.1. We also clip the gradients by norm 0.25 in case of the exploding gradient. (See the training-loop sketch after this table.) |
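The Pseudocode row above points to STORM (Algorithm 1). Below is a minimal sketch of the STORM-style recursive-momentum update for orientation only: `grad_fn`, the constant momentum weight `a_t`, and the fixed step size are illustrative assumptions and not taken from the paper, whose contribution is an adaptive variant under weaker assumptions.

```python
import numpy as np

def storm_step(x, x_prev, d_prev, grad_fn, a_t, lr, sample):
    """One STORM-style recursive-momentum step (sketch, not the paper's exact method).

    d_t     = grad(x_t; xi_t) + (1 - a_t) * (d_{t-1} - grad(x_{t-1}; xi_t))
    x_{t+1} = x_t - lr * d_t

    `grad_fn(params, sample)` is a hypothetical stochastic-gradient oracle;
    both gradients are evaluated on the same sample xi_t.
    """
    g_curr = grad_fn(x, sample)        # gradient at the current iterate
    g_prev = grad_fn(x_prev, sample)   # gradient at the previous iterate, same sample
    d = g_curr + (1.0 - a_t) * (d_prev - g_prev)  # variance-reduced gradient estimate
    x_next = x - lr * d                # descent step along the estimate
    return x_next, d

# Toy usage on f(x) = 0.5 * ||x||^2 with additive gradient noise (illustrative only).
rng = np.random.default_rng(0)
grad_fn = lambda w, s: w + s                                  # noisy gradient of f
x_prev = rng.normal(size=3)
d = grad_fn(x_prev, rng.normal(scale=0.1, size=3))            # d_1: plain stochastic gradient
x = x_prev - 0.1 * d
for _ in range(200):
    x_new, d = storm_step(x, x_prev, d, grad_fn, a_t=0.5, lr=0.1,
                          sample=rng.normal(scale=0.1, size=3))
    x_prev, x = x, x_new
```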
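The Experiment Setup row quotes batch sizes, epoch counts, and gradient clipping by norm 0.25. The sketch below shows where such clipping would sit in a generic PyTorch training loop; the model, optimizer, loss, and data loader are placeholders, and only the clip norm of 0.25 comes from the quoted setup.

```python
import torch.nn.functional as F
from torch.nn.utils import clip_grad_norm_

def train(model, optimizer, loader, epochs, clip_norm=0.25):
    """Generic training loop illustrating gradient clipping by global norm.

    Only the clip norm (0.25) is taken from the quoted setup; everything
    else here is a hypothetical placeholder, not the paper's code.
    """
    model.train()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(inputs), targets)
            loss.backward()
            # Clip the global gradient norm to guard against exploding gradients.
            clip_grad_norm_(model.parameters(), max_norm=clip_norm)
            optimizer.step()
```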