Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function
Authors: Elissa Mhanna, Mohamad Assaad
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, a numerical example validates our theoretical results. |
| Researcher Affiliation | Academia | Université Paris-Saclay, CNRS, CentraleSupélec, Laboratoire des signaux et systèmes, 91190, Gif-sur-Yvette, France. |
| Pseudocode | Yes | Algorithm 1 The 1P-DSGT-NC Algorithm |
| Open Source Code | No | The paper does not contain any statement about making its source code publicly available or provide a link to a code repository. |
| Open Datasets | Yes | We aim to classify m images of two digits taken from the MNIST data set (Le Cun & Cortes, 2005) using logistic regression. |
| Dataset Splits | No | The paper mentions splitting the dataset among agents (e.g., 'm = 12183 images in total and divided equally over n = 31 agents') and an 'independent test set', but it does not explicitly provide details about training, validation, and test splits (e.g., percentages or sample counts for each). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, or cloud computing specifications). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | The querying noise is ζ_{i,k} ∼ N(0, 1), ∀i ∈ N, the stochastic variable's standard deviation is σ_u = 0.01, the regularization constant is c = 0.1, the step sizes are η_k = 1.5(k+1)^{−0.51} and γ_k = 3.5(k+1)^{−0.17}, and every dimension of the perturbation vector z_k is chosen from {±1/√d} with equal probability. For the DSGT-NC algorithm, the step size is η_k = 2.5(k+1)^{−0.51}, and no noise other than that on the exact gradient is considered. Both algorithms are initialized with the same random weight vectors θ_{i,0} ∼ U([−1, 1]^d), ∀i ∈ N, per simulation instance. |
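The quoted setup can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the negative step-size exponents (decaying schedules) and the ±1/√d perturbation entries are assumptions recovered from the garbled excerpt, and the dimension `d` and RNG seed are arbitrary choices for the sketch.

```python
import numpy as np

# Hedged sketch of the quoted 1P-DSGT-NC experiment configuration.
# Symbol names (eta, gamma, z_k, theta) follow the paper's notation;
# d and the seed are illustrative assumptions.
rng = np.random.default_rng(0)

d = 785          # assumed: flattened 28x28 MNIST image + bias term
n_agents = 31    # m = 12183 images divided equally over n = 31 agents
sigma_u = 0.01   # standard deviation of the stochastic variable
c = 0.1          # regularization constant

def eta(k):
    """Step size eta_k = 1.5 * (k + 1)^(-0.51) (decay exponent assumed negative)."""
    return 1.5 * (k + 1) ** (-0.51)

def gamma(k):
    """Step size gamma_k = 3.5 * (k + 1)^(-0.17)."""
    return 3.5 * (k + 1) ** (-0.17)

def perturbation(d, rng):
    """Perturbation vector z_k: each coordinate is +/- 1/sqrt(d) with equal
    probability, so that ||z_k|| = 1."""
    return rng.choice([-1.0, 1.0], size=d) / np.sqrt(d)

# Common random initialization theta_{i,0} ~ U([-1, 1]^d) for every agent i.
theta0 = rng.uniform(-1.0, 1.0, size=(n_agents, d))

z = perturbation(d, rng)
```

Under these assumptions the perturbation vector has unit norm, which is the usual normalization for single-point zeroth-order gradient estimates.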