Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization
Authors: Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen, Paishun Ting, Shiyu Chang, Lisa Amini
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experimental results show that our approaches outperform other state-of-the-art ZO algorithms, and strike a balance between the convergence rate and the function query complexity. |
| Researcher Affiliation | Collaboration | Sijia Liu¹, Bhavya Kailkhura², Pin-Yu Chen¹, Paishun Ting³, Shiyu Chang¹, Lisa Amini¹; ¹MIT-IBM Watson AI Lab, IBM Research; ²Lawrence Livermore National Laboratory; ³University of Michigan, Ann Arbor |
| Pseudocode | Yes | Algorithm 1: SVRG(T, m, {η_k}, b, x_0); Algorithm 2: ZO-SVRG(T, m, {η_k}, b, x_0, µ) |
| Open Source Code | Yes | Code to reproduce experiments can be found at https://github.com/IBM/ZOSVRG-BlackBox-Adv |
| Open Datasets | Yes | We use a well-trained DNN on the MNIST handwritten digit classification task as the target black-box model... The used dataset consists of N = 1000 crystalline materials/compounds extracted from Open Quantum Materials Database [33]. |
| Dataset Splits | No | The paper specifies training and testing samples but does not explicitly mention a validation split. 'We split the dataset into two equal parts, leading to n = 500 training samples and (N − n) testing samples.' |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It discusses general experimental setups but not the underlying hardware. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, TensorFlow 2.x, PyTorch 1.x). |
| Experiment Setup | Yes | We choose n = 10 images from the same class, and set the same parameters b = 5 and constant step size 30/d for both ZO methods, where d = 28 × 28 is the image dimension. For ZO-SVRG-Ave, we set m = 10 and vary the number of random direction samples q ∈ {10, 20, 30}. |
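
To make the roles of the parameters in the ZO-SVRG(T, m, {η_k}, b, x_0, µ) signature concrete, the following is a minimal Python sketch of a zeroth-order SVRG loop with a two-point random gradient estimator. It is not the authors' released code. The helper names (`unit_direction`, `zo_grad`, `mean_vec`, `make_quadratic`), the toy quadratic objective, the small dimension d = 5, the step size 0.5/d, and the choice to draw one random direction per component function at each snapshot are all assumptions made for illustration; only n = 10, b = 5, and m = 10 mirror the reported setup.

```python
import math
import random


def unit_direction(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def zo_grad(f_i, x, u, mu):
    """Two-point estimate (d / mu) * (f_i(x + mu*u) - f_i(x)) * u, using function values only."""
    d = len(x)
    shifted = [xj + mu * uj for xj, uj in zip(x, u)]
    scale = d * (f_i(shifted) - f_i(x)) / mu
    return [scale * uj for uj in u]


def mean_vec(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def zo_svrg(fs, x0, T, m, eta, b, mu, seed=0):
    """Sketch of ZO-SVRG: T iterations, epoch length m, constant step eta,
    mini-batch size b, smoothing parameter mu."""
    rng = random.Random(seed)
    n, d = len(fs), len(x0)
    x = list(x0)
    for k in range(T):
        if k % m == 0:
            # Start of an epoch: store a snapshot and estimate the full gradient there.
            # Assumption of this sketch: one direction per component, redrawn each epoch.
            snapshot = list(x)
            dirs = [unit_direction(d, rng) for _ in range(n)]
            g_full = mean_vec([zo_grad(fs[i], snapshot, dirs[i], mu) for i in range(n)])
        batch = rng.sample(range(n), b)
        # Variance-reduced direction: batch estimate at x, minus the same batch
        # estimate at the snapshot, plus the full-gradient estimate at the snapshot.
        g_x = mean_vec([zo_grad(fs[i], x, dirs[i], mu) for i in batch])
        g_s = mean_vec([zo_grad(fs[i], snapshot, dirs[i], mu) for i in batch])
        x = [xj - eta * (gx - gs + gf) for xj, gx, gs, gf in zip(x, g_x, g_s, g_full)]
    return x


def make_quadratic(a):
    """Hypothetical component function f_i(x) = 0.5 * ||x - a||^2 (toy stand-in for a black-box loss)."""
    return lambda x: 0.5 * sum((xj - aj) ** 2 for xj, aj in zip(x, a))


data_rng = random.Random(1)
d, n = 5, 10  # toy dimension; the paper's attack setting uses d = 28 x 28
centers = [[1.0 + 0.1 * data_rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
fs = [make_quadratic(a) for a in centers]


def objective(x):
    """Average of the component functions."""
    return sum(f(x) for f in fs) / n


x0 = [4.0] * d
x_final = zo_svrg(fs, x0, T=200, m=10, eta=0.5 / d, b=5, mu=1e-3)
print(objective(x0), objective(x_final))
```

The toy objective is convex and chosen only so the sketch runs quickly without external dependencies; the paper's experiments target nonconvex black-box losses, for which the step size and smoothing parameter would need to be tuned separately.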