Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Leveraging partial stragglers within gradient coding

Authors: Aditya Ramamoorthy, Ruoyu Meng, Vrinda S. Girimaji

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present comparisons with the original GC protocol via a simulation of the distributed system for both the exact and approximate GC scenarios.
Researcher Affiliation | Academia | Aditya Ramamoorthy, Ruoyu Meng, Vrinda S. Girimaji, Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50010. EMAIL
Pseudocode | Yes | Algorithm 1: Find-Encoding-Coeff
Open Source Code | Yes | All software code for recreating these results can be found at [40]. [40] Aditya Ramamoorthy, Ruoyu Meng, and Vrinda S. Girimaji. Leveraging partial stragglers within gradient coding software repository. https://github.com/flamethrower775/Leveraging-partial-stragglers-within-gradient-coding, 2024.
Open Datasets | No | The paper describes how the datasets (random regular graphs) were generated for the simulations: 'We considered two different random regular graphs, Gi, i = 1, 2 with sizes m = 200, 300 and degree ν = 8. These graphs... were used in [6]'. While a reference is provided for the *type* of graph, there is no direct link, DOI, or specific citation to a pre-existing public dataset used for training or evaluation in their experiments.
Dataset Splits | No | The paper describes its simulation setup, including parameters for generating graphs and simulating failures, and it mentions performing '1000 simulations'. However, it does not specify any explicit training, validation, or test dataset splits in terms of percentages or sample counts.
Hardware Specification | Yes | These simulations were performed on a MacBook Pro (M1 chip, 16 GB RAM).
Software Dependencies | No | The paper mentions that 'All software code for recreating these results can be found at [40],' which implies the use of programming languages and libraries. However, it does not explicitly list any specific software dependencies (e.g., Python, PyTorch, NumPy) along with their version numbers.
Experiment Setup | Yes | We generated a random vector of length m where α workers, chosen uniformly at random, are set as failed (α is chosen based on the scenario). For the other workers, the amount of time taken to process a chunk is chosen i.i.d. from an exponential distribution with parameter µ = 1. The entries of the matrix R are chosen i.i.d. from the standard normal N(0, 1).
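
The experiment-setup row above can be sketched in a few lines of NumPy. This is a minimal illustration of the described sampling procedure, not the authors' code (which is at [40]); the function name and the values m = 200, α = 16, and the row dimension of R are placeholders chosen here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_worker_state(m, alpha, mu=1.0, d=4):
    """One simulation draw following the setup quoted above.

    m     -- number of workers (placeholder value in the demo call)
    alpha -- number of failed workers, chosen uniformly at random
    mu    -- rate of the exponential per-chunk processing time (µ = 1 in the paper)
    d     -- number of rows of the random matrix R (illustrative choice)
    """
    failed = rng.choice(m, size=alpha, replace=False)   # α workers fail, uniformly at random
    times = rng.exponential(scale=1.0 / mu, size=m)     # i.i.d. Exp(µ) chunk-processing times
    times[failed] = np.inf                              # failed workers never finish
    R = rng.standard_normal((d, m))                     # entries of R i.i.d. N(0, 1)
    return failed, times, R

failed, times, R = simulate_worker_state(m=200, alpha=16)
```

Repeating such a draw (e.g., 1000 times, as the paper's simulations do) and recording which workers complete which chunks yields the failure patterns over which the exact and approximate GC schemes are compared.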