General Stability Analysis for Zeroth-Order Optimization Algorithms
Authors: Xinyue Liu, Hualin Zhang, Bin Gu, Hong Chen
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | NUMERICAL EXPERIMENTS: In this section, we assess the generalization errors associated with optimizing nonconvex loss functions using ZO-GD, ZO-SGD, and ZO-SVRG. The primary goal is to verify the generalization errors of different zeroth-order optimization algorithms and different gradient estimators. To achieve this, we conduct experiments on two nonconvex models: nonconvex logistic regression and a two-layer neural network. |
| Researcher Affiliation | Academia | (1) College of Informatics, Huazhong Agricultural University, China; (2) Engineering Research Center of Intelligent Technology for Agriculture, China; (3) School of Artificial Intelligence, Jilin University, China; (4) Mohamed bin Zayed University of Artificial Intelligence, UAE |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is publicly available. |
| Open Datasets | Yes | For both nonconvex models, we use the Australian dataset from LIBSVM. |
| Dataset Splits | Yes | We separate the dataset into two parts: 80% for training and 20% for test. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running the experiments are provided. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | For all experiments, we set the maximum number of iterations to 2000. The batch size of the stochastic gradient is set to 50. The initial learning rate is set to 0.01 and is decreased every T iterations by a factor of γ. Both T and γ are determined by a grid search, with T chosen from {30, 60, 100, 150, 200, 250} and γ from {0.6, 0.7, 0.8, 0.9}. For the 2-point gradient estimator, we also conduct a grid search over the parameter K, chosen from {2, 3, 4, 6, 8, 9, 12}. (Hedged sketches of the 2-point estimator and of this step-decay/grid-search setup follow the table.) |
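
For context on the methods named in the Research Type and Experiment Setup rows, below is a minimal sketch (not the authors' code) of the classic 2-point zeroth-order gradient estimator that ZO-GD and ZO-SGD-style methods build on: the gradient is approximated from two loss evaluations along a random direction. The function name `two_point_grad_estimate` and the smoothing parameter `mu` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def two_point_grad_estimate(f, x, mu=1e-3, num_directions=1, rng=None):
    """Estimate grad f(x) with 2-point (forward-difference) random queries.

    f               : callable returning a scalar loss for a parameter vector
    x               : current parameter vector (1-D numpy array)
    mu              : smoothing radius of the finite difference
    num_directions  : K, number of random directions averaged over
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(num_directions):
        u = rng.standard_normal(d)          # random Gaussian direction
        diff = (f(x + mu * u) - f(x)) / mu  # directional finite difference
        grad += diff * u
    return grad / num_directions

# Usage: one ZO-SGD-style update on a toy quadratic loss.
if __name__ == "__main__":
    f = lambda w: 0.5 * np.dot(w, w)
    w = np.ones(5)
    g = two_point_grad_estimate(f, w, mu=1e-3, num_directions=4)
    w = w - 0.01 * g                        # gradient step with lr = 0.01
```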
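Similarly, the step-decay learning-rate schedule and the (T, γ) grid search described in the Experiment Setup row can be sketched as follows, assuming a plain "multiply by γ every T iterations" rule; the helper names `step_decay_lr` and `grid_search_schedule` are hypothetical and only reflect the grids {30, 60, 100, 150, 200, 250} and {0.6, 0.7, 0.8, 0.9} stated in the paper.

```python
from itertools import product

def step_decay_lr(lr0, gamma, T, iteration):
    """Learning rate after `iteration` steps: lr0 * gamma ** (iteration // T)."""
    return lr0 * (gamma ** (iteration // T))

def grid_search_schedule(train_fn, T_grid=(30, 60, 100, 150, 200, 250),
                         gamma_grid=(0.6, 0.7, 0.8, 0.9)):
    """Pick (T, gamma) minimizing the validation loss returned by train_fn."""
    best = None
    for T, gamma in product(T_grid, gamma_grid):
        val_loss = train_fn(T=T, gamma=gamma)   # user-supplied training run
        if best is None or val_loss < best[0]:
            best = (val_loss, T, gamma)
    return best[1], best[2]

# Usage with a dummy objective standing in for an actual ZO training run.
if __name__ == "__main__":
    dummy = lambda T, gamma: abs(T - 100) + abs(gamma - 0.8)
    T_best, gamma_best = grid_search_schedule(dummy)
    print(T_best, gamma_best, step_decay_lr(0.01, gamma_best, T_best, 500))
```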