Stochastic Nested Variance Reduction for Nonconvex Optimization
Authors: Dongruo Zhou, Pan Xu, Quanquan Gu
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we compare our algorithm SNVRG with other baseline algorithms on training a convolutional neural network for image classification. We plotted the training loss and test error for different algorithms on each dataset in Figure 3. |
| Researcher Affiliation | Academia | Dongruo Zhou, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, drzhou@cs.ucla.edu; Pan Xu, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, panxu@cs.ucla.edu; Quanquan Gu, Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095, qgu@cs.ucla.edu |
| Pseudocode | Yes | Algorithm 1 One-epoch-SNVRG(x_0, F, K, M, {T_l}, {B_l}, B), Algorithm 2 SNVRG, Algorithm 3 SNVRG-PL (a numpy sketch of the nested estimator follows the table) |
| Open Source Code | No | The paper describes the implementation environment ('All algorithm are implemented in Pytorch platform version 0.4.0 within Python 3.6.4.') but does not state that source code is released and provides no repository link for the described methodology. |
| Open Datasets | Yes | We use three image datasets: (1) The MNIST dataset [42] consists of handwritten digits and has 50,000 training examples and 10,000 test examples. (2) CIFAR10 dataset [22] consists of images in 10 classes and has 50,000 training examples and 10,000 test examples. (3) SVHN dataset [33] consists of images of digits and has 531,131 training examples and 26,032 test examples. (A torchvision loading sketch follows the table.) |
| Dataset Splits | No | The paper provides training and test set sizes for the datasets (e.g., '50,000 training examples and 10,000 test examples' for MNIST), but does not explicitly mention a validation split or details for reproducing such a split. |
| Hardware Specification | Yes | All experiments are conducted on Amazon AWS p2.xlarge servers which comes with Intel Xeon E5 CPU and NVIDIA Tesla K80 GPU (12G GPU RAM). |
| Software Dependencies | Yes | All algorithm are implemented in Pytorch platform version 0.4.0 within Python 3.6.4. |
| Experiment Setup | Yes | For SGD, we search the batch size from {256, 512, 1024, 2048} and the initial step sizes from {1, 0.1, 0.01}. Following the convention of deep learning practice, we apply learning rate decay schedule to each algorithm with the learning rate decayed by 0.1 every 20 epochs. (A PyTorch sketch of this tuning protocol follows the table.) |
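
The Pseudocode row names One-epoch-SNVRG with nested loop lengths {T_l} and batch sizes {B_l}. Below is a minimal numpy sketch of that nested variance-reduced estimator on a toy least-squares finite sum; the objective, the `grad_batch` helper, the constant step size, and the parameter values in the usage example are illustrative assumptions, not the authors' PyTorch implementation.

```python
import numpy as np

def grad_batch(A, b, x, idx):
    """Average gradient of f_i(x) = 0.5 * (a_i^T x - b_i)^2 over the sampled indices."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def one_epoch_snvrg(A, b, x0, K, step, Ts, Bls, B, rng):
    """One epoch of the nested estimator: Ts[l-1] = T_l and Bls[l-1] = B_l for levels 1..K."""
    n = len(b)
    x = x0.copy()
    ref_x = [x0.copy() for _ in range(K + 1)]           # reference points, levels 0..K
    ref_g = [np.zeros_like(x0) for _ in range(K + 1)]   # reference gradients, levels 0..K
    idx0 = rng.choice(n, size=min(B, n), replace=False)
    ref_g[0] = grad_batch(A, b, x0, idx0)               # level-0 gradient on a large batch

    for t in range(int(np.prod(Ts))):
        for l in range(1, K + 1):
            # Level l refreshes its reference point/gradient every prod_{j>l} T_j steps.
            if t % int(np.prod(Ts[l:])) == 0:
                ref_x[l] = x.copy()
                idx = rng.choice(n, size=Bls[l - 1], replace=True)
                ref_g[l] = (grad_batch(A, b, ref_x[l], idx)
                            - grad_batch(A, b, ref_x[l - 1], idx))
        x = x - step * sum(ref_g)                       # nested variance-reduced update
    return x

# Toy usage on a random least-squares problem (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
b = rng.standard_normal(1000)
x_out = one_epoch_snvrg(A, b, np.zeros(20), K=2, step=0.05,
                        Ts=[5, 10], Bls=[200, 20], B=1000, rng=rng)
```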
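For the Open Datasets row, the paper does not describe its data pipeline. The sketch below shows one plausible way to obtain MNIST, CIFAR10, and SVHN with torchvision; the transform and the SVHN split choice are assumptions.

```python
import torchvision
import torchvision.transforms as T

def load_datasets(root="./data"):
    to_tensor = T.ToTensor()  # transform choice is an assumption; the paper does not specify one
    mnist = torchvision.datasets.MNIST(root, train=True, download=True, transform=to_tensor)
    mnist_test = torchvision.datasets.MNIST(root, train=False, download=True, transform=to_tensor)
    cifar = torchvision.datasets.CIFAR10(root, train=True, download=True, transform=to_tensor)
    cifar_test = torchvision.datasets.CIFAR10(root, train=False, download=True, transform=to_tensor)
    # torchvision's SVHN "train" split has 73,257 images; the 531,131 training examples quoted
    # above match the size of the "extra" split, so that split is used here as an assumption.
    svhn = torchvision.datasets.SVHN(root, split="extra", download=True, transform=to_tensor)
    svhn_test = torchvision.datasets.SVHN(root, split="test", download=True, transform=to_tensor)
    return (mnist, mnist_test), (cifar, cifar_test), (svhn, svhn_test)
```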
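The Experiment Setup row quotes the SGD tuning grid and the learning-rate schedule. A hedged PyTorch sketch of that protocol follows; `build_model`, `make_loader`, `train_one_epoch`, and `evaluate` are hypothetical user-supplied callables, and the total epoch count is an assumption.

```python
import itertools
import torch
from torch.optim.lr_scheduler import StepLR

BATCH_SIZES = [256, 512, 1024, 2048]   # grid quoted in the paper
INIT_LRS = [1.0, 0.1, 0.01]            # grid quoted in the paper

def tune_sgd(build_model, make_loader, train_one_epoch, evaluate, epochs=60):
    """Grid-search SGD over batch size and initial step size; decay the LR by 0.1 every 20 epochs."""
    best_err, best_cfg = float("inf"), None
    for bs, lr in itertools.product(BATCH_SIZES, INIT_LRS):
        model = build_model()
        loader = make_loader(batch_size=bs)
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)
        scheduler = StepLR(optimizer, step_size=20, gamma=0.1)  # decay by 0.1 every 20 epochs
        for _ in range(epochs):
            train_one_epoch(model, loader, optimizer)
            scheduler.step()
        err = evaluate(model)
        if err < best_err:
            best_err, best_cfg = err, (bs, lr)
    return best_cfg, best_err
```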