Towards Tight Communication Lower Bounds for Distributed Optimisation

Authors: Janne H. Korhonen, Dan Alistarh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our main result provides the first fully unconditional bounds on the total number of bits that need to be sent and received by the N machines to solve this problem under point-to-point communication, within a given error tolerance. Specifically, we show that Ω(Nd log d/Nε) total bits need to be communicated between the machines to find an additive ε-approximation to the minimum of ∑_{i=1}^N f_i(x) (restated in display form below the table). The result holds for both deterministic and randomised algorithms and, importantly, requires no assumptions on the algorithm structure.
Researcher Affiliation | Collaboration | Dan Alistarh (IST Austria & Neural Magic, Inc., dan.alistarh@ist.ac.at); Janne H. Korhonen (IST Austria, janne.korhonen@ist.ac.at)
Pseudocode | Yes | The algorithm proceeds in rounds t = 1, 2, ..., T. At the beginning of round t + 1, each node i knows the values of the iterate x^(t), the global quantisation estimate q^(t), and its local quantisation estimate q_i^(t), for i = 1, 2, ..., N. At step t, the nodes perform the following steps (a runnable sketch is given below the table):
(1) Each node i updates its iterate as x^(t+1) = x^(t) − γ q^(t).
(2) Each node i computes its local gradient at x^(t+1) and transmits it in quantised form to the coordinator as follows. Let ε₁ = δR^(t+1)/2N and R₁ = R^(t+1)/N.
  (a) Node i computes ∇f_i(x^(t+1)) locally, and sends the message m_i = enc_{ε₁, R₁}(∇f_i(x^(t+1))) to the coordinator.
  (b) The coordinator receives the messages m_i for i = 1, 2, ..., N, and decodes them as q_i^(t+1) = dec_{ε₁, R₁}(q_i^(t), m_i). The coordinator then computes r^(t+1) = ∑_{i=1}^N q_i^(t+1).
(3) The coordinator sends the quantised sum of gradients to all other nodes as follows. Let ε₂ = δR^(t+1)/2 and R₂ = 2R^(t+1).
  (a) The coordinator sends the message m = enc_{ε₂, R₂}(r^(t+1)) to each node i.
  (b) Each node decodes the coordinator's message as q^(t+1) = dec_{ε₂, R₂}(q^(t), m).
Open Source Code | No | The paper does not contain any explicit statements or links indicating the release of open-source code for the described methodology.
Open Datasets | No | The paper discusses mathematical bounds and theoretical algorithms for optimising quadratic functions (of the form x ↦ β₀‖x − x₀‖₂²), not the use of specific public datasets for training. Thus, no information on dataset availability is provided.
Dataset Splits | No | The paper discusses theoretical bounds and algorithms. It does not mention using specific datasets or providing training/test/validation splits, as it does not conduct empirical experiments.
Hardware Specification | No | The paper is theoretical and focuses on communication complexity, algorithms, and proofs. It does not describe any empirical experiments or the hardware used to run them.
Software Dependencies | No | The paper is theoretical and focuses on mathematical proofs and algorithm design. It does not discuss specific software dependencies or their version numbers for implementation.
Experiment Setup | No | The paper focuses on theoretical contributions (lower bounds and algorithm design with analysis) rather than empirical experiments. Therefore, no specific experimental setup details such as hyperparameters or training configurations are provided.
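For readability, the lower bound quoted in the Research Type row is restated below as a display formula. This is a transcription only: ε is restored where the extraction dropped the symbol, and the grouping of the fraction is kept exactly as in the abstract's inline form.

```latex
% Main result, restated from the abstract: N machines, each holding a
% d-dimensional local function f_i; \hat{x} denotes the algorithm's output.
% The bound holds for deterministic and randomised algorithms alike, under
% point-to-point communication.
\Omega\bigl( Nd \log d / N\varepsilon \bigr)
\quad \text{total bits to guarantee} \quad
\sum_{i=1}^{N} f_i(\hat{x}) \;\le\; \min_{x} \sum_{i=1}^{N} f_i(x) + \varepsilon .
```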
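The round structure from the Pseudocode row can also be sketched as code. The following is a minimal, hypothetical Python simulation under stated assumptions: enc/dec are illustrative stand-ins (coordinate-wise rounding to an ε-grid) rather than the lattice-based quantiser the paper relies on, the range parameters R₁, R₂ and the per-round shrinking of R^(t) are omitted, and all names (quantised_gd, enc, dec) are invented for this sketch.

```python
import numpy as np

def enc(eps, x):
    # Hypothetical stand-in quantiser: round each coordinate to an eps-grid.
    # The paper's scheme is lattice-based and its message length also depends
    # on a range bound R; that dependence is dropped here.
    return np.round(x / eps).astype(np.int64)

def dec(eps, hint, msg):
    # The paper's decoder uses `hint` (a previous estimate known to be close
    # to the encoded vector) to keep messages short; this stand-in ignores it
    # and simply rescales the rounded coordinates.
    _ = hint
    return msg * eps

def quantised_gd(grads, x0, gamma, delta, R, T):
    """Simulate T rounds with one coordinator and N = len(grads) nodes.

    grads[i] maps an iterate x to node i's local gradient at x.
    R is kept fixed here; in the paper, R^(t) shrinks from round to round.
    """
    N = len(grads)
    x = x0.astype(float).copy()
    q = np.zeros_like(x)                        # global quantised estimate q^(t)
    q_i = [np.zeros_like(x) for _ in range(N)]  # coordinator's per-node estimates

    for _ in range(T):
        # Step (1): every node takes the same step using the shared estimate.
        x = x - gamma * q

        # Step (2): nodes quantise and send local gradients; coordinator decodes.
        eps1 = delta * R / (2 * N)
        msgs = [enc(eps1, g(x)) for g in grads]
        q_i = [dec(eps1, q_i[i], m) for i, m in enumerate(msgs)]
        r = np.sum(q_i, axis=0)                 # r^(t+1): decoded gradient sum

        # Step (3): coordinator broadcasts the quantised sum; nodes decode it.
        eps2 = delta * R / 2
        q = dec(eps2, q, enc(eps2, r))
    return x
```

A small usage example, with quadratic local functions in the spirit of the Open Datasets row (each node holds f_i(x) = ½‖x − c_i‖₂², so the minimiser of the sum is the mean of the centres):

```python
centres = [np.array([1.0, -2.0]), np.array([3.0, 0.5]), np.array([0.0, 4.0])]
grads = [lambda x, c=c: x - c for c in centres]   # grad f_i(x) = x - c_i
x_hat = quantised_gd(grads, x0=np.zeros(2), gamma=0.3, delta=0.05, R=8.0, T=100)
# x_hat should be close to the mean of the centres, up to quantisation error.
```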