DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep Learning

Authors: Hang Xu, Kelly Kostopoulou, Aritra Dutta, Xin Li, Alexandros Ntoulas, Panos Kalnis

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments with real models demonstrate that DeepReduce transmits 320% less data than existing sparsifiers, without affecting accuracy. (See the top-k sparsification sketch following this table.)
Researcher Affiliation | Academia | Hang Xu (KAUST, hang.xu@kaust.edu.sa); Kelly Kostopoulou (Columbia University, kelkost@cs.columbia.edu); Aritra Dutta (KAUST, aritra.dutta@kaust.edu.sa); Xin Li (University of Central Florida, xin.li@ucf.edu); Alexandros Ntoulas (NKUA, antoulas@di.uoa.gr); Panos Kalnis (KAUST, panos.kalnis@kaust.edu.sa)
Pseudocode | Yes | We present the pseudo-code of policy P2 in Algorithm 1, Appendix B.5.
Open Source Code | Yes | Code is available at https://github.com/hangxu0304/DeepReduce.
Open Datasets | Yes | Benchmarks. We employ the popular FedML [33] benchmark that uses an LSTM model [59] to perform next-word prediction in a federated learning setting, on the Stack Overflow [67] dataset with 135,818,730 training and 16,586,035 test examples. Table 1 (benchmarks and datasets; the last column shows the best quality achieved by the no-compression baseline) is reproduced below; see also the dataset-loading sketch after this table.
Type | Model | Task | Dataset | Parameters | Optimizer | Platform | Metric | Baseline
CNN | ResNet-20 [34] | Image classif. | CIFAR-10 [48] | 269,722 | SGD-M [73] | TFlow | Top-1 Acc. | 90.94%
CNN | DenseNet40-K12 [37] | Image classif. | CIFAR-10 [48] | 357,491 | SGD-M [73] | TFlow | Top-1 Acc. | 91.76%
CNN | ResNet-50 [34] | Image classif. | ImageNet [17] | 25,557,032 | SGD-M [73] | TFlow | Top-1 Acc. | 73.78%
MLP | NCF [35] | Recommendation | MovieLens-20M [56] | 31,832,577 | Adam [46] | PyTorch | Best Hit Rate | 94.97%
RNN | LSTM [59] | Next word pred. | Stack Overflow [67] | 4,053,428 | FedAvg [55] | PyTorch | Top-1 Acc. | 18.56%
Dataset Splits | No | The paper does not explicitly provide train/validation/test split percentages or counts for the datasets used. It mentions 135,818,730 training and 16,586,035 test examples for Stack Overflow, but it specifies no validation set and no formal split methodology for the other datasets.
Hardware Specification | Yes | Each instance is equipped with a 4-core Intel CPU @ 2.50GHz, 16GB RAM, and an NVIDIA Tesla T4 GPU with 16GB on-board memory (see Appendix F.1 for details). We also run simulated deployments on a local cluster of 8 nodes, each with a 16-core Intel CPU @ 2.6GHz, 512GB RAM, one NVIDIA Tesla V100 GPU with 16GB on-board memory, and a 100Gbps network.
Software Dependencies | No | The paper mentions that 'DeepReduce supports TensorFlow and PyTorch' but does not specify their version numbers or any other software dependencies with version information.
Experiment Setup | Yes | Each client executes 1 local epoch; the learning rate is 0.3 and the batch size is 16. (See the client-update sketch below for how these hyperparameters fit together.)
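For context on the "existing sparsifiers" quoted in the Research Type row: sparsifiers transmit a gradient as an (indices, values) pair instead of a dense tensor, and DeepReduce's contribution is compressing those two components further before transmission. Below is a minimal top-k sparsifier sketch in PyTorch; the 1% density ratio and the helper names are our own illustrative choices, not the paper's code.

```python
import torch

def topk_sparsify(grad: torch.Tensor, ratio: float = 0.01):
    """Keep only the largest-magnitude `ratio` fraction of gradient entries.

    Returns (indices, values, numel): the sparse representation that a
    framework like DeepReduce would compress further before transmission.
    """
    flat = grad.flatten()
    k = max(1, int(flat.numel() * ratio))
    # Select the k entries with the largest absolute value.
    _, indices = torch.topk(flat.abs(), k)
    values = flat[indices]
    return indices, values, flat.numel()

def densify(indices, values, numel):
    """Reconstruct a dense tensor from the sparse (indices, values) pair."""
    out = torch.zeros(numel, dtype=values.dtype)
    out[indices] = values
    return out

# Example: sparsify a fake gradient to 1% density.
g = torch.randn(269_722)               # e.g., ResNet-20's parameter count
idx, val, n = topk_sparsify(g, 0.01)
g_hat = densify(idx, val, n)
```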
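On the Open Datasets row: CIFAR-10, used by the ResNet-20 and DenseNet40-K12 benchmarks, can be fetched programmatically, which is one way to start reproducing Table 1 (ImageNet, MovieLens-20M, and Stack Overflow must be obtained from their respective distributors). A minimal sketch assuming torchvision is installed; the 45,000/5,000 validation carve-out is our illustration, since per the Dataset Splits row the paper specifies no validation split.

```python
import torch
from torchvision import datasets, transforms

# CIFAR-10 downloads automatically to ./data on first use.
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
test_set = datasets.CIFAR10(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())

# The paper reports no validation split; a common workaround is to carve
# one out of the 50,000 training images (split sizes are illustrative).
train_subset, val_subset = torch.utils.data.random_split(
    train_set, [45_000, 5_000],
    generator=torch.Generator().manual_seed(0))
```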
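On the Experiment Setup row: a minimal sketch of where the three quoted hyperparameters (1 local epoch, learning rate 0.3, batch size 16) sit inside a FedAvg client update. The model, loss function, and data pipeline are placeholders, not the paper's FedML/LSTM code.

```python
import copy
import torch
from torch.utils.data import DataLoader

def client_update(global_model, local_dataset,
                  lr=0.3, batch_size=16, local_epochs=1):
    """One FedAvg client round with the hyperparameters quoted above."""
    model = copy.deepcopy(global_model)        # start from the server model
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loader = DataLoader(local_dataset, batch_size=batch_size, shuffle=True)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(local_epochs):              # "each client executes 1 local epoch"
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    return model.state_dict()                  # shipped (compressed) to the server
```

In a DeepReduce-style deployment, the returned update would be sparsified and compressed before being averaged by the server, rather than sent dense as in this sketch.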