An Improved Analysis of Gradient Tracking for Decentralized Machine Learning

Authors: Anastasiia Koloskova, Tao Lin, Sebastian U. Stich

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We verify this dependence in numerical experiments.
Researcher Affiliation | Academia | Anastasia Koloskova (EPFL, anastasia.koloskova@epfl.ch); Tao Lin (EPFL, tao.lin@epfl.ch); Sebastian U. Stich (EPFL, sebastian.stich@epfl.ch; current affiliation: CISPA Helmholtz Center for Information Security).
Pseudocode | Yes | Algorithm 1 GRADIENT TRACKING (a hedged sketch of the update appears below the table).
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described, nor does it explicitly state that the code is being released or is available.
Open Datasets | No | We consider simple quadratic functions defined as f_i(x) = ‖x‖², and x^(0) is randomly initialized from a normal distribution N(0, 1). We artificially add stochastic noise to the gradients as ∇F_i(x, ξ) = ∇f_i(x) + ξ, where ξ ∼ N(0, (σ²/d) I), so that Assumption 4 is satisfied.
Dataset Splits | No | The paper uses a synthetic experimental setup and does not specify dataset split information for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments, either in the main text or in the experimental setup description. Mathematica is cited, but it is unclear whether it is the primary software dependency for the experiments, and no other dependencies are listed with versions.
Experiment Setup | Yes | We fix the stepsize γ to be constant, vary p and c, and measure the value of f(x̄^(t)) − f⋆ that GT reaches after a large number of steps. For a fixed number of n = 300 nodes with d = 100, we vary the value of the parameter p by interpolating the ring topology (with uniform weights) with the fully-connected graph. We take the ring topology on a fixed number of n = 300 nodes and reduce the self-weights to achieve different values of c (see appendix for details). Otherwise the setup is as above. (A sketch of one possible mixing-matrix construction also follows the table.)
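
For orientation, here is a minimal sketch of a gradient-tracking loop on the synthetic quadratic setup quoted above. It is an illustration under stated assumptions (one common GT update order, f_i(x) = ‖x‖² on every node, per-coordinate noise variance σ²/d), not the paper's Algorithm 1 or its experimental code.

```python
# A minimal sketch (not the paper's code) of a gradient-tracking loop on the
# synthetic quadratic setup quoted above. Assumptions: one common GT update
# order, f_i(x) = ||x||^2 for every node, and per-coordinate noise variance
# sigma^2 / d so that E||xi||^2 = sigma^2.
import numpy as np


def gradient_tracking(W, grad, x0, gamma, T, rng):
    """Run T gradient-tracking steps.

    W     : (n, n) doubly stochastic mixing matrix
    grad  : callable grad(X, rng) -> (n, d) stochastic gradients (one row per node)
    x0    : (n, d) initial iterates, one row per node
    gamma : constant stepsize
    """
    X = x0.copy()
    G = grad(X, rng)   # current stochastic gradients
    Y = G.copy()       # tracking variable, initialized with the first gradients
    for _ in range(T):
        X = W @ (X - gamma * Y)       # mix the locally updated iterates
        G_new = grad(X, rng)
        Y = W @ Y + (G_new - G)       # track the average-gradient signal
        G = G_new
    return X


def make_noisy_grad(sigma, d):
    """Gradient of f_i(x) = ||x||^2 with artificial Gaussian noise added."""
    def grad(X, rng):
        noise = rng.normal(0.0, sigma / np.sqrt(d), size=X.shape)  # xi ~ N(0, (sigma^2/d) I)
        return 2.0 * X + noise
    return grad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 300, 100
    W = np.full((n, n), 1.0 / n)                # fully-connected graph with uniform weights
    x0 = rng.normal(0.0, 1.0, size=(n, d))      # x(0) randomly initialized from N(0, 1)
    X = gradient_tracking(W, make_noisy_grad(sigma=1.0, d=d), x0, gamma=0.05, T=2000, rng=rng)
    print("f(mean iterate) - f* =", float(np.sum(X.mean(axis=0) ** 2)))  # f* = 0 here
```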
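
The experiment setup varies p by interpolating the ring topology with the fully-connected graph and varies c by reducing the ring's self-weights. The snippet below sketches one plausible way to build such mixing matrices; the exact construction is deferred to the paper's appendix and may differ from this sketch.

```python
# A hedged sketch of the two mixing-matrix families described in the
# experiment setup. The interpolation scheme, the self-weight reduction, and
# the spectral quantity printed below are assumptions; the exact construction
# is given in the paper's appendix.
import numpy as np


def ring_matrix(n, self_weight=1.0 / 3.0):
    """Ring topology: each node mixes with itself and its two neighbours.

    Lowering self_weight (and moving the remaining mass to the neighbours)
    is one way to sweep the parameter c while keeping W doubly stochastic.
    """
    W = np.zeros((n, n))
    nbr = (1.0 - self_weight) / 2.0
    for i in range(n):
        W[i, i] = self_weight
        W[i, (i - 1) % n] += nbr
        W[i, (i + 1) % n] += nbr
    return W


def interpolated_matrix(n, alpha):
    """Interpolate the uniform-weight ring with the fully-connected graph.

    alpha = 0 gives the ring, alpha = 1 the fully-connected graph; sweeping
    alpha varies the spectral parameter p (assumed interpolation scheme).
    """
    return (1.0 - alpha) * ring_matrix(n) + alpha * np.full((n, n), 1.0 / n)


if __name__ == "__main__":
    n = 300
    for alpha in (0.0, 0.1, 1.0):
        W = interpolated_matrix(n, alpha)
        lam = np.sort(np.abs(np.linalg.eigvalsh(W)))[-2]   # second-largest |eigenvalue|
        print(f"alpha = {alpha:.1f}  ->  1 - lambda_2^2 = {1.0 - lam**2:.3g}")
```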