Second Order Optimality in Decentralized Non-Convex Optimization via Perturbed Gradient Tracking
Authors: Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Numerical Experiments: In this section, we compare PDGT with a simple version of D-GET where each node has full knowledge of its local gradient. ... In Fig. 1 the experiment is run for 10 nodes, and the target rank is 20. ... In Fig. 2, the experiment is run for 30 nodes and the target rank is 30. |
| Researcher Affiliation | Academia | Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari; Department of Electrical and Computer Engineering, The University of Texas at Austin; {isidoros_13,constantine,mokhtari}@utexas.edu |
| Pseudocode | Yes | Algorithm 1: PDGT algorithm; Algorithm 2: PDGT algorithm, Phase I; Algorithm 3: PDGT algorithm, Phase II |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is provided or available. |
| Open Datasets | Yes | We focus on a matrix factorization problem for the MovieLens dataset, where the goal is to find a rank $r$ approximation of a matrix $M \in \mathcal{M}^{l \times n}$, representing the ratings from 943 users to 1682 movies. |
| Dataset Splits | No | The paper does not explicitly specify training, validation, or test dataset splits (e.g., percentages, sample counts, or specific splitting methodology). |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU models, or cloud resources) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | The stepsize for D-GET and both phases of PDGT is 3. Regarding the parameters of PDGT, we set the number of rounds during phase I and II to be 1500 and 100, respectively. Further, we set the threshold before we add noise during phase I as presented in (8) to be $10^{-6}$ and the radius of the noise injected to be 4. |
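
The Experiment Setup row can be read as a small configuration plus a noise-injection rule. The sketch below is only an illustration of that reading, not the authors' code (the paper does not state that code is available). The class name `PDGTSetup`, its field names, and the helpers `sample_in_ball` and `maybe_perturb` are hypothetical. Two further assumptions are made: the noise is drawn uniformly from a Euclidean ball of the stated radius, and a plain gradient-norm comparison stands in for the paper's condition (8), which is not reproduced in this table.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PDGTSetup:
    """Values quoted in the Experiment Setup row; the field names are hypothetical."""
    stepsize: float = 3.0          # used for D-GET and both phases of PDGT
    phase1_rounds: int = 1500      # number of rounds in phase I
    phase2_rounds: int = 100       # number of rounds in phase II
    noise_threshold: float = 1e-6  # threshold checked before noise is added in phase I
    noise_radius: float = 4.0      # radius of the injected noise


def sample_in_ball(dim, radius, rng):
    """Draw a point uniformly from a Euclidean ball (the noise law is an assumption)."""
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in direction))
    # Scaling by u**(1/dim) makes the point uniform over the ball, not just its surface.
    scale = radius * rng.random() ** (1.0 / dim) / norm
    return [scale * v for v in direction]


def maybe_perturb(x, grad_estimate, setup, rng):
    """Return (new_x, perturbed); noise is added only when the gradient estimate is small.

    The norm test below is a stand-in for the paper's condition (8).
    """
    grad_norm = math.sqrt(sum(g * g for g in grad_estimate))
    if grad_norm > setup.noise_threshold:
        return list(x), False
    noise = sample_in_ball(len(x), setup.noise_radius, rng)
    return [xi + ni for xi, ni in zip(x, noise)], True
```

A caller would build `PDGTSetup()` once, pass a seeded `random.Random` for repeatability, and call `maybe_perturb` at the point in phase I where the threshold check takes place.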