Distributed Optimization for Overparameterized Problems: Achieving Optimal Dimension Independent Communication Complexity
Authors: Bingqing Song, Ioannis Tsaknakis, Chung-Yiu Yau, Hoi-To Wai, Mingyi Hong
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Preliminary Numerical Experiments. We conclude by presenting a numerical experiment for the UCI Tom’s Hardware dataset using Alg. 2 where we applied ⌊log(t+100)⌋ rounds of communication at the t-th iteration for the CHOCO-GOSSIP subroutine; see Appendix F.1. We consider a ring network with K = 5 agents, each one has 500 or 1000 samples (thus making N = 2500, or N = 5000). We construct D-dimensional features from the dataset as NTK features [Bietti and Mairal, 2019]. In Fig. 1, we train a least square regression model in the overparameterized regime. |
| Researcher Affiliation | Academia | Bingqing Song (Department of ECE, University of Minnesota, song0409@umn.edu); Ioannis Tsaknakis (Department of ECE, University of Minnesota, tsakn001@umn.edu); Chung-Yiu Yau (Department of SEEM, Chinese University of Hong Kong, cyyau@se.cuhk.edu.hk); Hoi-To Wai (Department of SEEM, Chinese University of Hong Kong, htwai@cuhk.edu.hk); Mingyi Hong (Department of ECE, University of Minnesota, mhong@umn.edu) |
| Pseudocode | Yes | Algorithm 1 Limited Communication Distributed Optimization Algorithm ... Algorithm 2 Decentralized Gradient Descent with Compressed Comm. via Linear Compression |
| Open Source Code | No | The paper does not provide any explicit statements about making its source code open, nor does it include a link to a code repository for the described methodology. |
| Open Datasets | Yes | Preliminary Numerical Experiments. We conclude by presenting a numerical experiment for the UCI Tom’s Hardware dataset using Alg. 2 where we applied ⌊log(t+100)⌋ rounds of communication at the t-th iteration for the CHOCO-GOSSIP subroutine; see Appendix F.1. We consider a ring network with K = 5 agents, each one has 500 or 1000 samples (thus making N = 2500, or N = 5000). We construct D-dimensional features from the dataset as NTK features [Bietti and Mairal, 2019]. In Fig. 1, we train a least square regression model in the overparameterized regime. Available: https://archive.ics.uci.edu/ml/datasets/Buzz+in+social+media+ |
| Dataset Splits | No | The paper mentions training a model on the dataset but does not specify details regarding train/validation/test splits or cross-validation setup for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific library versions). |
| Experiment Setup | Yes | We consider a ring network with K = 5 agents, each one has 500 or 1000 samples (thus making N = 2500, or N = 5000). ... we applied ⌊log(t+100)⌋ rounds of communication at the t-th iteration for the CHOCO-GOSSIP subroutine. |
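
Since no code is released, the setup quoted in the table can only be illustrated, not reproduced. The following is a minimal sketch of the two pieces the excerpt does specify: the per-iteration communication budget ⌊log(t+100)⌋ and a ring of K = 5 agents. Several details are assumptions rather than the authors' choices: the logarithm is taken to be natural, the mixing weights (1/3 on self and on each neighbor) are hypothetical, and the function names `gossip_rounds` and `ring_mixing_matrix` are made up for illustration. CHOCO-GOSSIP itself (with compression) is not implemented here.

```python
import math


def gossip_rounds(t):
    """Communication rounds at iteration t: floor(log(t + 100)).

    Assumption: natural logarithm; the excerpt does not state the base.
    """
    return math.floor(math.log(t + 100))


def ring_mixing_matrix(k, self_weight=1.0 / 3.0):
    """Symmetric, doubly stochastic mixing matrix for a ring of k >= 3 agents.

    Assumption: uniform weights; the paper's actual weights are not quoted.
    """
    neighbor_weight = (1.0 - self_weight) / 2.0
    w = [[0.0] * k for _ in range(k)]
    for i in range(k):
        w[i][i] = self_weight
        w[i][(i - 1) % k] = neighbor_weight  # left neighbor on the ring
        w[i][(i + 1) % k] = neighbor_weight  # right neighbor on the ring
    return w


K = 5
W = ring_mixing_matrix(K)

# Total sample counts for the two quoted settings (500 or 1000 per agent).
total_samples = [K * n for n in (500, 1000)]

# Communication budget for the first few iterations.
schedule = [gossip_rounds(t) for t in range(5)]
```

Under the natural-log assumption the budget grows very slowly with t, so the per-iteration communication cost stays close to constant over a typical run.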