Learning to Learn by Zeroth-Order Oracle

Authors: Yangjun Ruan, Yuanhao Xiong, Sashank Reddi, Sanjiv Kumar, Cho-Jui Hsieh

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our learned optimizer outperforms hand-designed algorithms in terms of convergence rate and final solution on both synthetic and practical ZO optimization problems (in particular, the black-box adversarial attack task, which is one of the most widely used applications of ZO optimization). We finally conduct extensive analytical experiments to demonstrate the effectiveness of our proposed optimizer.
Researcher Affiliation | Collaboration | Yangjun Ruan (1), Yuanhao Xiong (2), Sashank Reddi (3), Sanjiv Kumar (3), Cho-Jui Hsieh (2,3); (1) Department of Information Science and Electrical Engineering, Zhejiang University; (2) Department of Computer Science, UCLA; (3) Google Research
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at https://github.com/RYoungJ/ZO-L2L
Open Datasets | Yes | We follow the same neural network architectures used in Cheng et al. (2019) for the MNIST and CIFAR-10 datasets
Dataset Splits | No | The paper mentions training and testing splits for the optimizer's training (e.g., 'randomly select 100 images... to train the optimizer and select another 100 images to test the learned optimizer'), but it does not specify a separate validation set for hyperparameter tuning during this process.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | Yes | For each task, we tune the hyperparameters of baseline algorithms to report the best performance. Specifically, we set the learning rate of baseline algorithms to δ/d. We first coarsely tune the constant δ on a logarithmic range {0.01, 0.1, 1, 10, 100, 1000} and then finetune it on a linear range. For ZO-ADAM, we tune β1 values over {0.9, 0.99} and β2 values over {0.99, 0.996, 0.999}. In all experiments, we use 1-layer LSTM with 10 hidden units for both the Update RNN and the Query RNN. For each RNN, another linear layer is applied to project the hidden state to the output (1-dim parameter update for the Update RNN and 1-dim predicted variance for the Query RNN). The regularization parameter λ in the training objective function (equation 4) is set to 0.005. We use ADAM to train our proposed optimizer with truncated BPTT, each optimization is run for 200 steps and unrolled for 20 steps unless specified otherwise. At test time, we set the Bernoulli random variable (see Section 3.1) X ∼ Ber(0.5).
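
To make the quoted setup concrete, here is a minimal PyTorch sketch of the described optimizer architecture. Only the sizes and hyperparameters (1-layer LSTM with 10 hidden units, 1-dim linear projection, λ = 0.005, 200 steps unrolled in segments of 20, X ∼ Ber(0.5)) come from the quoted text; the class name, input features, and coordinatewise batching convention are assumptions on our part, and the authors' released code is the authoritative reference.

```python
# Minimal sketch (PyTorch assumed) of the setup quoted above.
# Names such as CoordinatewiseRNN are illustrative, not from the paper's code.
import torch
import torch.nn as nn


class CoordinatewiseRNN(nn.Module):
    """1-layer LSTM with 10 hidden units, followed by a linear layer that
    projects the hidden state to a 1-dim output, applied per coordinate."""

    def __init__(self, input_dim=1, hidden_size=10):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_size)  # 1 layer by default
        self.proj = nn.Linear(hidden_size, 1)        # 1-dim output

    def forward(self, x, state=None):
        # x: (seq_len=1, num_coords, input_dim) -- one step per optimizer call
        out, state = self.lstm(x, state)
        return self.proj(out), state


update_rnn = CoordinatewiseRNN()  # outputs the 1-dim parameter update
query_rnn = CoordinatewiseRNN()   # outputs the 1-dim predicted variance

# Meta-training as quoted: ADAM over both RNNs with truncated BPTT,
# 200 optimization steps unrolled in segments of 20; lambda from eq. 4.
meta_opt = torch.optim.Adam(
    list(update_rnn.parameters()) + list(query_rnn.parameters()))
TOTAL_STEPS, UNROLL_LEN, LAMBDA = 200, 20, 0.005

# Baseline tuning as quoted: learning rate delta/d, with delta first swept
# over {0.01, 0.1, 1, 10, 100, 1000} and then fine-tuned on a linear range.

# Test time: sample the Bernoulli switch X ~ Ber(0.5) from Section 3.1.
X = torch.bernoulli(torch.tensor(0.5))
```

The coordinatewise design (shared LSTM weights applied independently to every parameter coordinate) follows the learning-to-learn convention of Andrychowicz et al. (2016), which this paper builds on; whether the released code batches coordinates exactly this way is an assumption here.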