Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Local SGD Converges Fast and Communicates Little

Authors: Sebastian U. Stich

ICLR 2019

Reproducibility Variable Result LLM Response
Research Type Experimental In this section we show some numerical experiments to illustrate the results of Theorem 2.2. ... Experimental. We examine the practical speedup on a logistic regression problem, f(x) = (1/n) ∑_{i=1}^{n} log(1 + exp(−b_i a_i⊤x)) + (λ/2)‖x‖², where a_i ∈ ℝ^d and b_i ∈ {−1, +1} are the data samples. The regularization parameter is set to λ = 1/n. We consider the w8a dataset (Platt, 1999) (d = 300, n = 49749). We initialize all runs with x_0 = 0_d and measure the number of iterations to reach the target accuracy ϵ. ... We depict the results in Figure 3, again under the assumption ρ = 25.
Researcher Affiliation Academia Sebastian U. Stich, EPFL, Switzerland, EMAIL
Pseudocode Yes Algorithm 1 LOCAL SGD ... Algorithm 2 ASYNCHRONOUS LOCAL SGD (SCHEMATIC)
Open Source Code No The paper mentions other open-source frameworks used in distributed deep learning but does not provide any explicit statement or link for the open-source code of its own proposed methodology.
Open Datasets Yes We consider the w8a dataset (Platt, 1999) (d = 300, n = 49749).
Dataset Splits No The paper does not specify explicit training, validation, or test splits for the dataset used. It mentions reaching a 'target accuracy' as a stopping criterion, but no detailed split information.
Hardware Specification Yes For completeness, we report that all experiments were run on an Ubuntu 16.04 machine with a 24-core Intel® Xeon® CPU E5-2680 v3 @ 2.50GHz.
Software Dependencies No The paper mentions the operating system ('Ubuntu 16.04') but does not specify any programming languages, libraries, or frameworks with their version numbers that were used for implementing the experiments.
Experiment Setup Yes We initialize all runs with x_0 = 0_d and measure the number of iterations to reach the target accuracy ϵ. ... By extensive grid search we determine for each configuration (H, K, B) the best stepsize from the set {min(32, cn/(t+1)), 32c}, where c can take the values c = 2^i for i ∈ ℤ. ... The regularization parameter is set to λ = 1/n.
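To make the setup described above concrete, the following is a minimal sketch of the Local SGD scheme (Algorithm 1) applied to the regularized logistic-regression objective quoted in the table: K workers each take H local SGD steps with a decaying stepsize, then their iterates are averaged. The synthetic data, the choice of the constant c, and all function names are our assumptions for illustration; this is not the paper's implementation and does not reproduce its w8a experiments.

```python
import numpy as np

def loss(x, A, b, lam):
    """f(x) = (1/n) sum_i log(1 + exp(-b_i a_i^T x)) + (lam/2) ||x||^2,
    computed with logaddexp for numerical stability."""
    return np.mean(np.logaddexp(0.0, -b * (A @ x))) + 0.5 * lam * (x @ x)

def local_sgd(A, b, lam, K=4, H=8, rounds=50, c=2.0 ** -8, seed=0):
    """Sequential simulation of Local SGD: K workers each run H local
    SGD steps from the shared iterate, then iterates are averaged
    (the averaging step is the only 'communication').  The stepsize
    eta_t = min(32, c*n/(t+1)) follows the schedule quoted above;
    c = 2^-8 is an arbitrary choice from the grid {2^i : i in Z}."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)                                  # x_0 = 0_d, as in the paper
    for r in range(rounds):
        local_iterates = []
        for _ in range(K):                           # one pass per worker
            xk = x.copy()
            for h in range(H):                       # H local steps
                t = r * H + h                        # global step counter, shared across workers
                i = rng.integers(n)                  # sample one data point
                margin = b[i] * (A[i] @ xk)
                # stochastic gradient of the regularized logistic loss;
                # clipping the margin avoids overflow in exp
                grad = -b[i] * A[i] / (1.0 + np.exp(np.clip(margin, -60, 60))) + lam * xk
                eta = min(32.0, c * n / (t + 1))
                xk -= eta * grad
            local_iterates.append(xk)
        x = np.mean(local_iterates, axis=0)          # communication round: average
    return x
```

Increasing H reduces communication rounds for the same total number of gradient steps, which is the trade-off the paper's (H, K, B) grid explores.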