Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization

Authors: Feihu Huang, Bin Gu, Zhouyuan Huo, Songcan Chen, Heng Huang

AAAI 2019, pp. 1503–1510 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, the experimental results verify that our algorithms have a faster convergence rate than the existing zeroth-order proximal stochastic algorithm.
Researcher Affiliation | Collaboration | College of Computer Science & Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China; Department of Electrical & Computer Engineering, University of Pittsburgh, PA 15261, USA; JDDGlobal.com
Pseudocode | Yes | Algorithm 1 (ZO-ProxSVRG for Nonconvex Optimization) and Algorithm 2 (ZO-ProxSAGA for Nonconvex Optimization); a generic zeroth-order proximal update is sketched after this table.
Open Source Code | No | The paper does not provide an explicit statement about the release of source code for the described methodology or a link to a code repository.
Open Datasets | Yes | In the experiment, we use the publicly available real datasets¹, which are summarized in Table 2. Footnote 1: 20news is from the website https://cs.nyu.edu/~roweis/data.html; a9a, w8a and covtype.binary are from the website www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.
Dataset Splits | Yes | For each dataset, we use half of the samples as training data, and the rest as testing data. (A loading and splitting sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running experiments.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | In the algorithms, we fix the mini-batch size b = 20 and the smoothing parameters µ = 1/(dt) in the GauSGE and µ = 1/(d√t) in the CooSGE. Meanwhile, we fix λ1 = λ2 = 10^-5, and use the same initial solution x0, drawn from the standard normal distribution, in each experiment. In the experiment, we select n = 10 examples from the same class, and set the batch size b = 5 and a constant step size η = 1/d for the zeroth-order algorithms, where d = 28 × 28. In addition, we set λ1 = 10^-3 and λ2 = 1 in the experiment. (These values are instantiated in the sketch after the table.)
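
The "Pseudocode" and "Experiment Setup" rows refer to zeroth-order (gradient-free) proximal updates built from smoothed gradient estimators. The Python sketch below is not the paper's Algorithm 1 (ZO-ProxSVRG) or Algorithm 2 (ZO-ProxSAGA); it only illustrates, under our own naming, the ingredients such methods combine: a Gaussian-smoothing gradient estimator, a coordinate-wise estimator, and an ℓ1 proximal (soft-thresholding) step, instantiated with the step size, smoothing schedule, and regularization weight quoted above. The placeholder loss, toy dimension, and function names are assumptions for illustration only.

```python
import numpy as np

def gauss_grad_estimate(f, x, mu, rng):
    """One-sample Gaussian-smoothing estimate of grad f(x) (GauSGE-style)."""
    u = rng.standard_normal(x.shape[0])
    return (f(x + mu * u) - f(x)) / mu * u

def coord_grad_estimate(f, x, mu):
    """Coordinate-wise central-difference estimate of grad f(x) (CooSGE-style)."""
    g = np.zeros_like(x)
    for j in range(x.shape[0]):
        e = np.zeros_like(x)
        e[j] = 1.0
        g[j] = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    return g

def prox_l1(v, lam):
    """Proximal operator of lam * ||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

# One illustrative proximal zeroth-order step, plugging in values quoted in the
# table: eta = 1/d, lambda1 = 1e-5, and mu = 1/(d*t) for the Gaussian estimator.
rng = np.random.default_rng(0)
d, t = 100, 1                                 # toy dimension and iteration counter
eta, lam1 = 1.0 / d, 1e-5
f = lambda x: np.log1p(np.exp(-x)).sum()      # placeholder smooth loss, not the paper's
x = rng.standard_normal(d)                    # initial point drawn from N(0, I)
g = gauss_grad_estimate(f, x, mu=1.0 / (d * t), rng=rng)
x = prox_l1(x - eta * g, eta * lam1)
```

The variance-reduction corrections that distinguish the paper's SVRG- and SAGA-style algorithms (full or stored reference gradient estimates) are deliberately omitted here; the sketch only shows the estimator-plus-proximal-step skeleton they share.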
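For the "Open Datasets" and "Dataset Splits" rows, the following is a minimal loading sketch under the assumption that the LIBSVM copies of a9a, w8a and covtype.binary are read with scikit-learn; the paper does not state which tooling it used, and the file path below is a placeholder.

```python
# Hypothetical loading of a LIBSVM-format dataset and the 50/50 split reported
# in the table; scikit-learn utilities are an assumption, not the paper's tooling.
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

X, y = load_svmlight_file("a9a")  # file downloaded from the LIBSVM datasets page
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)  # half for training, half for testing
```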