Local SGD Converges Fast and Communicates Little

Authors: Sebastian U. Stich

ICLR 2019

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | In this section we show some numerical experiments to illustrate the results of Theorem 2.2. ... Experimental. We examine the practical speedup on a logistic regression problem, f(x) = (1/n) ∑_{i=1}^{n} log(1 + exp(−b_i a_i^⊤ x)) + (λ/2)‖x‖², where a_i ∈ ℝ^d and b_i ∈ {−1, +1} are the data samples. The regularization parameter is set to λ = 1/n. We consider the w8a dataset (Platt, 1999) (d = 300, n = 49749). We initialize all runs with x_0 = 0_d and measure the number of iterations to reach the target accuracy ϵ. ... We depict the results in Figure 3, again under the assumption ρ = 25. (A NumPy sketch of this objective follows the table.)
Researcher Affiliation | Academia | Sebastian U. Stich, EPFL, Switzerland, sebastian.stich@epfl.ch
Pseudocode | Yes | Algorithm 1 LOCAL SGD ... Algorithm 2 ASYNCHRONOUS LOCAL SGD (SCHEMATIC) (a simulation sketch of Local SGD follows the table)
Open Source Code | No | The paper cites open-source frameworks used in distributed deep learning, but gives no statement or link releasing code for its own method.
Open Datasets | Yes | We consider the w8a dataset (Platt, 1999) (d = 300, n = 49749). (a loading sketch follows the table)
Dataset Splits | No | The paper specifies no training/validation/test splits; it only reports reaching a target accuracy ϵ as the stopping criterion.
Hardware Specification | Yes | For completeness, we report that all experiments were run on an Ubuntu 16.04 machine with a 24-core Intel® Xeon® CPU E5-2680 v3 @ 2.50GHz.
Software Dependencies | No | The paper names the operating system (Ubuntu 16.04) but no programming languages, libraries, or frameworks with version numbers.
Experiment Setup | Yes | We initialize all runs with x_0 = 0_d and measure the number of iterations to reach the target accuracy ϵ. ... By extensive grid search we determine for each configuration (H, K, B) the best stepsize from the set {min(32, cn/(t+1)), 32c}, where c can take the values c = 2^i for i ∈ ℤ. ... The regularization parameter is set to λ = 1/n. (a stepsize-grid sketch follows the table)
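
The w8a dataset quoted above is distributed in LIBSVM format; a minimal loading sketch, assuming a local LIBSVM-format copy of the file (the filename is ours):

```python
from sklearn.datasets import load_svmlight_file

# Assumed: the LIBSVM-format copy of w8a saved locally as "w8a".
A, b = load_svmlight_file("w8a")  # A: sparse (n, d) features; b: labels in {-1, +1}
n, d = A.shape                    # the paper reports d = 300, n = 49749
lam = 1.0 / n                     # regularization parameter λ = 1/n, per the paper
```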
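The objective in the Research Type row is ℓ2-regularized logistic regression; here is a minimal NumPy sketch of it (function and variable names are ours, not the paper's):

```python
import numpy as np

def objective(x, A, b, lam):
    """f(x) = (1/n) · Σ_i log(1 + exp(−b_i · a_iᵀx)) + (λ/2)·‖x‖²."""
    margins = -b * (A @ x)                            # −b_i · a_iᵀx, shape (n,)
    data_term = np.mean(np.logaddexp(0.0, margins))   # stable log(1 + exp(·))
    return data_term + 0.5 * lam * np.dot(x, x)
```

Per the excerpts above, the paper evaluates this from x_0 = 0_d and counts iterations until the target accuracy ϵ is reached.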
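Algorithm 1 (LOCAL SGD) is only named in the Pseudocode row; below is a serial simulation sketch of the scheme the paper analyzes: K workers take H local SGD steps between synchronizations, and the iterates are averaged at each synchronization point. The helpers stoch_grad and stepsize are our assumptions, not the paper's code.

```python
import numpy as np

def local_sgd(x0, stoch_grad, stepsize, K, H, T, rng):
    """Serial simulation of Local SGD: K parallel iterate sequences,
    H local steps between synchronizations, T total steps per worker.
    stoch_grad(x, rng) -> unbiased stochastic gradient (assumed helper);
    stepsize(t)        -> learning rate at step t."""
    workers = [x0.copy() for _ in range(K)]
    t = 0
    while t < T:
        for _ in range(min(H, T - t)):   # local phase: no communication
            for k in range(K):
                workers[k] = workers[k] - stepsize(t) * stoch_grad(workers[k], rng)
            t += 1
        avg = np.mean(workers, axis=0)   # synchronize: average the K iterates
        workers = [avg.copy() for _ in range(K)]
    return workers[0]
```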
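The stepsize set in the Experiment Setup row is extraction-garbled; our reading of it is {min(32, cn/(t+1)), 32c} with c = 2^i, i ∈ ℤ. A sketch of enumerating such a grid, with the finite range of i as our assumption:

```python
def stepsize_grid(n, i_values=range(-10, 11)):
    """Yield candidate schedules for the grid search: for each c = 2^i,
    a decaying schedule t -> min(32, c·n/(t+1)) and a constant one t -> 32·c."""
    for i in i_values:
        c = 2.0 ** i
        yield ("decaying", lambda t, c=c: min(32.0, c * n / (t + 1)))
        yield ("constant", lambda t, c=c: 32.0 * c)
```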