Zeroth-order (Non)-Convex Stochastic Optimization via Conditional Gradient and Gradient Updates
Authors: Krishnakumar Balasubramanian, Saeed Ghadimi
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we propose and analyze zeroth-order stochastic approximation algorithms for nonconvex and convex optimization. Specifically, we propose generalizations of the conditional gradient algorithm achieving rates similar to the standard stochastic gradient algorithm using only zeroth-order information. Furthermore, under a structural sparsity assumption, we first illustrate an implicit regularization phenomenon where the standard stochastic gradient algorithm with zeroth-order information adapts to the sparsity of the problem at hand by just varying the stepsize. Next, we propose a truncated stochastic gradient algorithm with zeroth-order information, whose rate depends only poly-logarithmically on the dimensionality. Our contributions extend the applicability of zeroth-order stochastic optimization to the constrained and high-dimensional setting and also provide theoretical insights in the form of rates of convergence. A summary of the results is provided in Table 1. |
| Researcher Affiliation | Academia | Krishnakumar Balasubramanian Department of Statistics University of California, Davis kbala@ucdavis.edu Saeed Ghadimi Department of Operations Research and Financial Engineering Princeton University sghadimi@princeton.edu |
| Pseudocode | Yes | Algorithm 1 Zeroth-order Stochastic Conditional Gradient Method, Algorithm 2 Inexact Conditional Gradient (ICG) method, Algorithm 3 Zeroth-order Stochastic Accelerated Gradient Method with Inexact Updates, Algorithm 4 Zeroth-Order Stochastic Gradient Method, Algorithm 5 Truncated Zeroth-Order Stochastic Gradient Method |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the release of source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not mention the use of any datasets, public or otherwise, for training or analysis. |
| Dataset Splits | No | The paper is theoretical and does not mention any dataset splits for validation or other purposes. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or system-level training settings. |
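
For readers who want a concrete picture of the zeroth-order oracle behind the algorithms listed in the Pseudocode row above, the following is a minimal sketch in Python of a two-point Gaussian-smoothing gradient estimator plugged into a plain stochastic gradient loop (roughly the structure of Algorithm 4). The function names, smoothing parameter `mu`, and number of random directions are illustrative assumptions, not the authors' implementation; as noted in the table, the paper releases no code.

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-4, num_directions=10, rng=None):
    """Two-point Gaussian-smoothing zeroth-order gradient estimator.

    Uses only function evaluations of f (no analytic gradients):
        g ~ (1/m) * sum_i [(f(x + mu*u_i) - f(x)) / mu] * u_i,  u_i ~ N(0, I).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    fx = f(x)
    g = np.zeros(d)
    for _ in range(num_directions):
        u = rng.standard_normal(d)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_directions

def zo_sgd(f, x0, stepsize=0.01, iterations=200, **est_kwargs):
    """Plain zeroth-order stochastic gradient descent (a sketch in the spirit of Algorithm 4)."""
    x = x0.copy()
    for _ in range(iterations):
        x -= stepsize * zo_gradient_estimate(f, x, **est_kwargs)
    return x

# Illustrative usage on f(x) = ||x||^2, whose minimizer is the origin.
if __name__ == "__main__":
    x_final = zo_sgd(lambda x: float(np.dot(x, x)), np.ones(50), stepsize=0.05)
    print(np.linalg.norm(x_final))  # should be close to 0
```

The conditional-gradient variants in the paper (Algorithms 1–3) replace the gradient step above with a linear-minimization (Frank-Wolfe) update over the constraint set, and the truncated variant (Algorithm 5) additionally keeps only the largest coordinates of the iterate; this sketch only illustrates the shared zeroth-order gradient estimate.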