Straggler Mitigation in Distributed Optimization Through Data Encoding

Authors: Can Karakus, Yifan Sun, Suhas Diggavi, Wotao Yin

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide experimental results demonstrating the advantage of the approach over uncoded and data replication strategies."
Researcher Affiliation | Collaboration | Can Karakus (UCLA, Los Angeles, CA; karakus@ucla.edu); Yifan Sun (Technicolor Research, Los Altos, CA; Yifan.Sun@technicolor.com); Suhas Diggavi (UCLA; suhasdiggavi@ucla.edu); Wotao Yin (UCLA; wotaoyin@math.ucla.edu)
Pseudocode | No | The paper describes its algorithms textually but does not include structured pseudocode or an algorithm block.
Open Source Code | No | The paper refers to an arXiv preprint but provides no statement about, or link to, source code for the described methodology.
Open Datasets | Yes | "Matrix factorization on Movielens 1-M dataset [18] for the movie recommendation task."
Dataset Splits | Yes | "We withhold randomly 20% of these ratings," forming an 80/20 train/test split.
Hardware Specification | Yes | "We implement distributed L-BFGS as described in Section 3 on an Amazon EC2 cluster using the mpi4py Python package, over m = 32 m1.small worker node instances, and a single c3.8xlarge central server instance." The Movielens experiment is run on a single 32-core machine with 256 GB RAM.
Software Dependencies | No | The paper mentions the "mpi4py Python package" and "using the built-in function numpy.linalg.solve" but gives no version numbers for its software dependencies.
Experiment Setup | Yes | "for regularization parameter λ = 0.05. We evaluate column-subsampled Hadamard matrix with redundancy β = 2 (encoded using FWHT for fast encoding)... which are aggregated over 20 trials. We choose µ = 3, p = 15, and λ = 10..."
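The 80/20 split reported in the Dataset Splits row could be reproduced along these lines. This is a minimal sketch under assumptions of our own: the paper does not specify the sampling procedure, and the function name, seed, and shuffle-based selection below are illustrative, not the authors' code.

```python
import random

def train_test_split(ratings, test_frac=0.2, seed=42):
    # Randomly withhold test_frac of the ratings as a held-out test set
    # (illustrative assumption: uniform sampling via a seeded shuffle).
    ratings = list(ratings)
    rng = random.Random(seed)
    rng.shuffle(ratings)
    n_test = int(len(ratings) * test_frac)
    # First n_test shuffled entries become the test set; the rest train.
    return ratings[n_test:], ratings[:n_test]
```

With `test_frac=0.2`, 100 ratings yield 80 training and 20 test entries, matching the 80/20 split the report describes.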
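The Experiment Setup row mentions encoding with a column-subsampled Hadamard matrix (redundancy β = 2) using the fast Walsh-Hadamard transform (FWHT). A hedged sketch of that idea: embed the data vector into the randomly chosen column positions of a βn × βn Sylvester Hadamard matrix and apply the FWHT, so S x is computed in O(βn log βn) time without forming S. The function names, seeding, and the power-of-two size assumption below are ours, not the paper's.

```python
import random

def fwht(x):
    # Iterative fast Walsh-Hadamard transform (Sylvester ordering, unnormalized).
    # len(x) must be a power of 2.
    x = list(x)
    n, h = len(x), 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

def encode(x, beta=2, seed=0):
    # Encode a length-n vector with a column-subsampled Hadamard matrix S:
    # keep n random columns of the (beta*n)-dim Hadamard matrix H, so that
    # S @ x == H @ z, where z embeds x at the kept column indices.
    # Illustrative assumption: beta * len(x) is a power of 2.
    n = len(x)
    N = beta * n
    rng = random.Random(seed)
    cols = rng.sample(range(N), n)      # surviving Hadamard columns
    z = [0.0] * N
    for c, v in zip(cols, x):
        z[c] = v                        # embed x into the selected coordinates
    return fwht(z)                      # H @ z == S @ x via FWHT
```

Since the unnormalized Hadamard matrix satisfies Hᵀ H = N·I, the coded vector's squared norm is N times that of the input, which gives a quick sanity check on any implementation of this sketch.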