Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
The Non-IID Data Quagmire of Decentralized Machine Learning
Authors: Kevin Hsieh, Amar Phanishayee, Onur Mutlu, Phillip Gibbons
ICML 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we take a step toward better understanding this challenge by presenting a detailed experimental study of decentralized DNN training on a common type of data skew: skewed distribution of data labels across devices/locations. |
| Researcher Affiliation | Collaboration | 1Microsoft Research 2Carnegie Mellon University 3ETH Zürich. |
| Pseudocode | Yes | Gaia (Hsieh et al., 2017)... (Algorithm 1 in Appendix A). Federated Averaging (McMahan et al., 2017)... (Algorithm 2 in Appendix A). Deep Gradient Compression (Lin et al., 2018)... (Algorithm 3 in Appendix A). |
| Open Source Code | Yes | All source code and settings are available at https://github.com/kevinhsieh/non_iid_dml. |
| Open Datasets | Yes | We use two datasets, CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015)... To facilitate further study on skewed label partitions, we release a real-world, geo-tagged dataset of common mammals on Flickr (Flickr), which is openly available at https://doi.org/10.5281/zenodo.3676081 (§2.2). |
| Dataset Splits | Yes | We use the default validation set of each of the two datasets to quantify the validation accuracy as our model quality metric... We control the skewness by controlling the fraction of data that are non-IID. For example, 20% non-IID indicates 20% of the dataset is partitioned by labels, while the remaining 80% is partitioned uniformly at random. |
| Hardware Specification | No | The paper mentions running experiments on a "GPU parameter server system" but does not provide specific hardware details such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using "Caffe" but does not specify a version number or list other software dependencies with version information. |
| Experiment Setup | Yes | For all applications, we tune the training parameters (e.g., learning rate, minibatch size, number of epochs, etc.)... Appendix C lists all major training parameters in our study. |
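The "X% non-IID" skew control quoted under Dataset Splits (a fraction of the data partitioned by label, the rest uniformly at random) can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the function name, signature, and shard layout are our assumptions.

```python
import random

def skewed_partition(labels, num_workers, non_iid_frac, seed=0):
    """Split dataset indices across workers with controllable label skew.

    A fraction `non_iid_frac` of the data is partitioned by label
    (contiguous, label-sorted shards), while the remainder is spread
    uniformly at random -- mirroring the paper's "X% non-IID" setup.
    NOTE: this helper is a hypothetical sketch, not the authors' code.
    """
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)

    # Split the shuffled indices into a skewed pool and a uniform pool.
    n_skew = int(len(idx) * non_iid_frac)
    skewed, uniform = idx[:n_skew], idx[n_skew:]

    # Label-sorted part: contiguous shards concentrate labels per worker.
    skewed.sort(key=lambda i: labels[i])
    shard = -(-len(skewed) // num_workers)  # ceiling division
    parts = [skewed[w * shard:(w + 1) * shard] for w in range(num_workers)]

    # Uniform part: deal the remaining indices round-robin.
    for k, i in enumerate(uniform):
        parts[k % num_workers].append(i)
    return parts
```

With `non_iid_frac=1.0` and two workers over a two-label dataset, each worker receives only one label; with `non_iid_frac=0.0`, every worker sees a near-uniform label mix.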