Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization

Authors: Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our extensive experimental results show that our approaches outperform other state-of-the-art ZO algorithms, and strike a balance between the convergence rate and the function query complexity.
Researcher Affiliation | Collaboration | Sijia Liu¹, Bhavya Kailkhura², Pin-Yu Chen¹, Paishun Ting³, Shiyu Chang¹, Lisa Amini¹; ¹MIT-IBM Watson AI Lab, IBM Research; ²Lawrence Livermore National Laboratory; ³University of Michigan, Ann Arbor
Pseudocode | Yes | Algorithm 1: SVRG(T, m, {η_k}, b, x_0); Algorithm 2: ZO-SVRG(T, m, {η_k}, b, x_0, µ)
Open Source Code | Yes | Code to reproduce experiments can be found at https://github.com/IBM/ZOSVRG-BlackBox-Adv
Open Datasets | Yes | We use a well-trained DNN on the MNIST handwritten digit classification task as the target black-box model... The used dataset consists of N = 1000 crystalline materials/compounds extracted from Open Quantum Materials Database [33].
Dataset Splits | No | The paper specifies training and testing samples but does not explicitly mention a validation split. 'We split the dataset into two equal parts, leading to n = 500 training samples and (N − n) testing samples.'
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It discusses general experimental setups but not the underlying hardware.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x).
Experiment Setup | Yes | We choose n = 10 images from the same class, and set the same parameters b = 5 and constant step size 30/d for both ZO methods, where d = 28 × 28 is the image dimension. For ZO-SVRG-Ave, we set m = 10 and vary the number of random direction samples q ∈ {10, 20, 30}.
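
To make the roles of the parameters in the Pseudocode and Experiment Setup rows concrete (T, m, step size, b, µ, q), the following is a minimal sketch of a ZO-SVRG-style loop with an averaged random-direction gradient estimator. It is not the authors' implementation. The function names (zo_grad, unit_dirs, zo_svrg), the toy quadratic objective, the step size 0.1, and the choice to reuse the same random directions at the current iterate and at the snapshot are all assumptions made for illustration and may differ from Algorithm 2 in the paper.

```python
import numpy as np


def unit_dirs(rng, q, d):
    """Draw q random directions uniformly from the unit sphere in R^d."""
    u = rng.standard_normal((q, d))
    return u / np.linalg.norm(u, axis=1, keepdims=True)


def zo_grad(f, x, dirs, mu):
    """Averaged random-direction estimate: (d/mu) * mean_j [f(x + mu*u_j) - f(x)] * u_j."""
    d = x.size
    fx = f(x)
    diffs = np.array([f(x + mu * u) - fx for u in dirs])  # one function query per direction
    return (d / mu) * (diffs @ dirs) / len(dirs)


def zo_svrg(fs, x0, T, m, eta, b, mu, q=1, seed=0):
    """Sketch of a ZO-SVRG-style loop over component functions fs (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, d = len(fs), x0.size

    def batch_f(idx):
        # Average of the selected component functions.
        return lambda z: sum(fs[i](z) for i in idx) / len(idx)

    x = x0.astype(float).copy()
    for k in range(T):
        if k % m == 0:
            # Start of an epoch: store a snapshot and its full ZO gradient estimate.
            snap = x.copy()
            g_snap = zo_grad(batch_f(range(n)), snap, unit_dirs(rng, q, d), mu)
        idx = rng.choice(n, size=b, replace=False)  # mini-batch of size b
        dirs = unit_dirs(rng, q, d)                 # assumption: shared by both terms below
        f_b = batch_f(idx)
        # Variance-reduced blend: mini-batch estimate at x, corrected by the snapshot.
        v = zo_grad(f_b, x, dirs, mu) - zo_grad(f_b, snap, dirs, mu) + g_snap
        x = x - eta * v
    return x


# Toy problem (an assumption, not from the paper): f_i(z) = 0.5 * ||z - a_i||^2.
data_rng = np.random.default_rng(1)
n, d = 10, 10
a = data_rng.standard_normal((n, d))
fs = [lambda z, ai=ai: 0.5 * np.sum((z - ai) ** 2) for ai in a]
full = lambda z: sum(f(z) for f in fs) / n

x0 = np.full(d, 5.0)
x = zo_svrg(fs, x0, T=300, m=10, eta=0.1, b=5, mu=1e-3, q=10)
print("objective at x0:", full(x0))
print("objective after ZO-SVRG sketch:", full(x))
```

The values b = 5, m = 10, and q = 10 mirror the reported setup. The step size and smoothing parameter here are chosen for the toy quadratic and are not the paper's 30/d setting for d = 28 × 28. Each estimate costs q + 1 function queries, which is the query-complexity trade-off the paper refers to when varying q.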