Communication-Computation Efficient Gradient Coding

Authors: Min Ye, Emmanuel Abbe

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The scheme is implemented on Amazon EC2 clusters using Python with the mpi4py package. Results show that the proposed scheme maintains the same generalization error while reducing the running time by 32% compared to uncoded schemes and by 23% compared to prior coded schemes that focus only on stragglers (Tandon et al., ICML 2017).
Researcher Affiliation | Academia | Department of Electrical Engineering, Princeton University, Princeton, NJ 08544, USA; Program in Applied and Computational Mathematics and Department of Electrical Engineering, Princeton University, and the School of Mathematics, Institute for Advanced Study, Princeton, NJ 08544, USA.
Pseudocode | No | The paper describes its coding scheme in detail using mathematical equations and explanations in Section 3, but it does not present it in a structured pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | We used Python with the mpi4py package to implement our gradient coding schemes proposed in Section 3, where we chose {θ1, θ2, ..., θn} as {±(1 + i/2), i = 0, 1, 2, ..., n/2 - 1} for even n and {0, ±(1 + i/2), i = 0, 1, 2, ..., (n - 1)/2 - 1} for odd n. We used t2.micro instances on Amazon EC2 as worker nodes and a single c3.8xlarge instance as the master node. (A sketch of this construction of the θi appears after the table.)
Dataset Splits | No | The paper states using "N = 26220 training samples" but does not specify explicit percentages or counts for training, validation, or test splits. It refers to "generalization error" but not to a specific validation set.
Hardware Specification | Yes | We used t2.micro instances on Amazon EC2 as worker nodes and a single c3.8xlarge instance as the master node.
Software Dependencies | No | We used Python with the mpi4py package to implement our gradient coding schemes proposed in Section 3. No specific version numbers for Python or mpi4py are provided. (A minimal mpi4py master/worker sketch appears after the table.)
Experiment Setup | Yes | We used N = 26220 training samples and adopted Nesterov's Accelerated Gradient (NAG) descent (Bubeck, 2015) to train the model. These experiments were run on n = 10, 15, 20 worker nodes. In Table 3 we take n = k = 8, λ1 = 0.8, λ2 = 0.1, t1 = 1.6, t2 = 6, and we list E[Ttot] for all possible choices of d and m. (A sketch of the NAG update appears after the table.)
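
The Open Datasets row quotes the paper's choice of evaluation points {θ1, ..., θn}. Below is a minimal sketch of that construction, assuming the set is {±(1 + i/2), i = 0, ..., n/2 - 1} for even n and {0, ±(1 + i/2), i = 0, ..., (n - 1)/2 - 1} for odd n as in the quoted passage; theta_points is a hypothetical helper name, not code from the paper.

```python
import numpy as np

def theta_points(n):
    """Return n distinct evaluation points as described in the quoted passage (assumed form)."""
    if n % 2 == 0:
        pts = [s * (1 + i / 2) for i in range(n // 2) for s in (1, -1)]
    else:
        pts = [0.0] + [s * (1 + i / 2) for i in range((n - 1) // 2) for s in (1, -1)]
    return np.sort(np.array(pts))

print(theta_points(10))  # 10 symmetric points: ±1, ±1.5, ±2, ±2.5, ±3
print(theta_points(15))  # 15 points: 0 and ±1, ±1.5, ..., ±4
```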
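The implementation quoted in the Open Datasets and Software Dependencies rows uses Python with mpi4py on Amazon EC2. The sketch below illustrates only a generic master/worker broadcast-and-gather round that such an implementation could use; it is not the authors' code, and the paper's gradient encoding and straggler-tolerant decoding are omitted (workers send placeholder vectors that the master simply sums).

```python
# Run with e.g.: mpirun -n 4 python round_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

dim = 4  # toy model dimension

if rank == 0:
    # Master: broadcast the current model, then collect worker messages.
    w = np.zeros(dim)
    comm.bcast(w, root=0)
    msgs = comm.gather(None, root=0)
    grad = sum(m for m in msgs if m is not None)  # real scheme: decode from a subset of workers
    print("aggregated gradient:", grad)
else:
    # Worker: receive the model and return a (placeholder) partial gradient message.
    w = comm.bcast(None, root=0)
    partial = np.ones(dim) * rank
    comm.gather(partial, root=0)
```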
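The Experiment Setup row states that the model is trained with Nesterov's Accelerated Gradient (NAG) descent (Bubeck, 2015). The following is a minimal sketch of a standard NAG update with the usual momentum schedule; the step size, iteration count, and toy least-squares objective are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

def nag(grad_fn, x0, lr, num_iters):
    """Nesterov's accelerated gradient for a smooth objective given its gradient oracle."""
    x = np.asarray(x0, dtype=float)
    y_prev = x.copy()
    lam = 1.0
    for _ in range(num_iters):
        y = x - lr * grad_fn(x)                              # gradient step
        lam_next = (1.0 + np.sqrt(1.0 + 4.0 * lam ** 2)) / 2.0
        momentum = (lam - 1.0) / lam_next                    # Nesterov momentum schedule
        x = y + momentum * (y - y_prev)                      # extrapolation step
        y_prev, lam = y, lam_next
    return y_prev

# Toy usage: least-squares regression with a full-gradient oracle.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(50, 5)), rng.normal(size=50)
w = nag(lambda w: A.T @ (A @ w - b) / len(b), np.zeros(5), lr=0.1, num_iters=500)
```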