Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions

Authors: Wei Jiang, Sifan Yang, Yibo Wang, Lijun Zhang

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Numerical experiments across various tasks validate the effectiveness of our method. |
| Researcher Affiliation | Academia | 1 National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; 2 School of Artificial Intelligence, Nanjing University, Nanjing, China |
| Pseudocode | Yes | Algorithm 1: STORM Algorithm (see the sketch below this table) |
| Open Source Code | No | Due to privacy concerns and ongoing research, we do not include the code. |
| Open Datasets | Yes | Specifically, we train ResNet18 and ResNet34 models [He et al., 2016] on the CIFAR-10 and CIFAR-100 datasets [Krizhevsky, 2009], respectively. |
| Dataset Splits | No | The paper does not explicitly mention a separate validation split; only training and test sets are described. |
| Hardware Specification | Yes | All the experiments are conducted on eight NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using "Pytorch [Paszke et al., 2019]" but does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | For all optimizers, we set the batch size as 256 and train for 200 epochs. [...] The batch size is set as 20 and all methods are trained for 40 epochs with dropout rate 0.1. We also clip the gradients by norm 0.25 in case of the exploding gradient. (see the training-loop sketch below) |
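Since the paper's Algorithm 1 restates the STORM estimator (Cutkosky and Orabona, 2019) and no code is released, the following is a minimal sketch of the basic STORM recursive-momentum update only. The function name `storm_step`, the fixed learning rate `lr`, and the fixed momentum weight `a` are illustrative assumptions; the paper's contribution is an adaptive variant under weaker assumptions, which is not reproduced here.

```python
import torch

def storm_step(params, grads, prev_grads, momentum, lr=0.01, a=0.1):
    """One STORM recursive-momentum step (illustrative sketch).

    Estimator: d_t = g_t + (1 - a) * (d_{t-1} - g_{t-1}),
    where g_{t-1} is the gradient at the previous iterate evaluated on the
    *current* mini-batch; this correction term is what reduces variance.
    Update:    x_{t+1} = x_t - lr * d_t.
    """
    new_momentum = []
    for p, g, g_prev, d in zip(params, grads, prev_grads, momentum):
        d_new = g + (1 - a) * (d - g_prev)   # recursive momentum estimator
        p.data.add_(d_new, alpha=-lr)        # descent step along the estimator
        new_momentum.append(d_new)
    return new_momentum
```

Note that producing `prev_grads` requires a second backward pass at the previous iterate on the current mini-batch. In STORM and in the adaptive methods studied in this paper, the fixed `lr` and `a` above are replaced by schedules driven by observed gradient magnitudes.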
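The second reported setup (batch size 20, 40 epochs, dropout 0.1, gradient clipping at norm 0.25) corresponds to a standard PyTorch training loop of the following shape. The model, data, and optimizer below are toy stand-ins, since the authors' code is not released; only the batch size, epoch count, dropout rate, and clipping norm come from the paper.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from torch.nn.utils import clip_grad_norm_

# Toy stand-ins for the paper's actual model and dataset (code not released).
dataset = TensorDataset(torch.randn(200, 10), torch.randint(0, 2, (200,)))
model = nn.Sequential(nn.Linear(10, 32), nn.Dropout(0.1), nn.ReLU(), nn.Linear(32, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # the paper compares several optimizers here

loader = DataLoader(dataset, batch_size=20, shuffle=True)    # reported batch size: 20
for epoch in range(40):                                      # reported training length: 40 epochs
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        clip_grad_norm_(model.parameters(), max_norm=0.25)   # reported clipping norm: 0.25
        optimizer.step()
```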