A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Authors: Zhize Li, Jian Li

Venue: NeurIPS 2018

Reproducibility assessment. Each item below lists the reproducibility variable, the result, and the supporting LLM response.

Research Type: Experimental. "Finally, we conduct several experiments and the experimental results are consistent with the theoretical results." The paper compares the nonconvex ProxSVRG+ against nonconvex ProxGD, ProxSGD [10], and ProxSVRG [24] on the nonnegative principal component analysis (NN-PCA) problem (same as [24]), and reports that "the experimental results on both datasets (corresponding to the first row and second row in Figures 3-5) are almost the same."

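The NN-PCA problem is only named above, so here is a minimal Python sketch of it, assuming the standard formulation used in [24]: minimize -(1/(2n)) * sum_i (z_i^T x)^2 over {x : x >= 0, ||x||_2 <= 1}, where the constraint set enters through the nonsmooth term h as an indicator function. All names are illustrative, not taken from the authors' code.

```python
import numpy as np

def f(x, Z):
    """Smooth, nonconvex part: negative average squared projection onto x."""
    return -0.5 * np.mean((Z @ x) ** 2)

def grad_f(x, Z_batch):
    """Stochastic gradient of f over a minibatch of rows Z_batch."""
    return -(Z_batch.T @ (Z_batch @ x)) / Z_batch.shape[0]

def prox_h(x, eta=None):
    """Prox of the indicator of {x : x >= 0, ||x||_2 <= 1}.

    For an indicator function the prox is a projection and ignores eta:
    clip negatives, then rescale onto the unit ball if needed.
    """
    x = np.maximum(x, 0.0)
    nrm = np.linalg.norm(x)
    return x / nrm if nrm > 1.0 else x
```
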
Researcher Affiliation: Academia. Zhize Li (IIIS, Tsinghua University, zz-li14@mails.tsinghua.edu.cn) and Jian Li (IIIS, Tsinghua University, lijian83@mail.tsinghua.edu.cn).

Pseudocode: Yes. The paper gives pseudocode as Algorithm 1 (nonconvex ProxSVRG+).

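Algorithm 1 itself is not reproduced in this summary. As a rough guide, a hedged sketch of the ProxSVRG+ loop structure (an outer snapshot gradient computed on a batch of size B, then inner variance-reduced proximal steps on minibatches of size b) might look like the following; the parameter names and helper signatures are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def prox_svrg_plus(grad_f, prox_h, x0, n, S, m, B, b, eta, rng=None):
    """Sketch of nonconvex ProxSVRG+ (after Algorithm 1 in the paper).

    grad_f(x, idx) should return the average gradient of f_i over the
    index set idx; prox_h(x, eta) is the proximal operator of eta * h.
    """
    rng = np.random.default_rng() if rng is None else rng
    x_tilde = x0.copy()
    iterates = []
    for s in range(S):
        # Outer loop: snapshot gradient on a subsampled batch of size B.
        I_B = rng.choice(n, size=B, replace=False)
        g = grad_f(x_tilde, I_B)
        x = x_tilde.copy()
        for k in range(m):
            # Inner loop: variance-reduced gradient on a minibatch of size b.
            I_b = rng.choice(n, size=b, replace=True)
            v = grad_f(x, I_b) - grad_f(x_tilde, I_b) + g
            # Proximal gradient step handles the nonsmooth term h.
            x = prox_h(x - eta * v, eta)
            iterates.append(x.copy())
        x_tilde = x.copy()
    # Return an iterate chosen uniformly at random, as in the analysis.
    return iterates[rng.integers(len(iterates))]
```

Note that taking B = n makes the snapshot gradient exact, recovering a ProxSVRG-style outer loop; ProxSVRG+ also allows B < n, as in the experiments below where B = n/5.
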
Open Source Code: No. The paper contains no explicit statement about releasing source code for the described methodology and no link to a code repository.

Open Datasets: Yes. "We conduct the experiment on the standard MNIST and a9a datasets." The datasets can be downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

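For anyone reproducing the setup, the LIBSVM-format files can be read with scikit-learn's load_svmlight_file. The file names below are assumptions about which files were downloaded from that page; this snippet is not from the paper.

```python
from sklearn.datasets import load_svmlight_file

# Assumed file names from the LIBSVM dataset page linked above.
X_a9a, y_a9a = load_svmlight_file("a9a")              # 32,561 samples, 123 features
X_mnist, y_mnist = load_svmlight_file("mnist.scale")  # 60,000 samples, 780 features

# Dense sample matrix Z for use with the NN-PCA sketch above.
Z = X_a9a.toarray()
```
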
Dataset Splits: No. The paper mentions the standard MNIST and a9a datasets but does not say how they were split into training, validation, and test sets, nor does it cite predefined splits.

Hardware Specification: No. The paper does not report the hardware (e.g., GPU/CPU models, memory, or cloud instance types) used to run its experiments.

Software Dependencies: No. The paper does not list the software dependencies or versions (e.g., library names with version numbers) needed to replicate the experiments.

Experiment Setup: Yes. The step sizes η are set to the values used in each method's convergence results: for ProxGD, η = 1/L (Corollary 1 in [10]); for ProxSGD, η = 1/(2L) (Corollary 3 in [10]); for ProxSVRG, η = b^{3/2}/(3Ln) (Theorem 6 in [24]); and for ProxSVRG+, η = 1/(6L) (Theorem 1). The batch size B (Line 4 of Algorithm 1) is n/5, i.e., 20% of the data samples.

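These reported hyperparameters translate directly into a small helper. This is an illustrative sketch of the stated values, given the smoothness constant L, the sample count n, and the ProxSVRG minibatch size b; it is not code from the paper.

```python
def step_sizes(L, n, b):
    """Step sizes eta used in the experiments, per each method's theory."""
    return {
        "ProxGD": 1.0 / L,                     # Corollary 1 in [10]
        "ProxSGD": 1.0 / (2.0 * L),            # Corollary 3 in [10]
        "ProxSVRG": b ** 1.5 / (3.0 * L * n),  # Theorem 6 in [24]
        "ProxSVRG+": 1.0 / (6.0 * L),          # Theorem 1
    }

def batch_size(n):
    """Batch size B from Line 4 of Algorithm 1: 20% of the n samples."""
    return n // 5
```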