Distributed Machine Learning through Heterogeneous Edge Systems

Authors: Hanpeng Hu, Dan Wang, Chuan Wu (pp. 7179-7186)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our testbed implementation and experiments show that ADSP outperforms existing parameter synchronization models significantly in terms of ML model convergence time, scalability and adaptability to large heterogeneity." "We implement ADSP as a ready-to-use Python library based on TensorFlow (Abadi et al. 2016), and evaluate its performance with testbed experiments." (Section 5, Performance Evaluation)
Researcher Affiliation | Academia | Hanpeng Hu¹, Dan Wang², Chuan Wu¹ (¹The University of Hong Kong, ²The Hong Kong Polytechnic University)
Pseudocode | Yes | The paper presents Algorithm 1, "Commit Rate Adjustment at the Scheduler". (An illustrative, hedged sketch of a commit-rate adjustment loop is given after the table.)
Open Source Code | No | The paper states "We implement ADSP as a ready-to-use Python library based on TensorFlow", but does not provide a link or any explicit statement that the library is open-source or publicly available.
Open Datasets | Yes | "(i) image classification on Cifar-10 (Krizhevsky and Hinton 2010) using a CNN model from the TensorFlow tutorial (Tensorflow 2019)"
Dataset Splits | No | The paper mentions using the Cifar-10 dataset and training with mini-batches, but it does not specify how the data were split into training, validation, and test sets (e.g., percentages or sample counts). (The standard Cifar-10 split is sketched after the table for reference.)
Hardware Specification | Yes | "Testbed. We emulate heterogeneous edge systems following the distribution of hardware configurations of edge devices in a survey (Jkielty 2019), using 19 Amazon EC2 instances (Wang and Ng 2010): 7 t2.large instances, 5 t2.xlarge instances, 4 t2.2xlarge instances and 2 t3.xlarge instances as workers, and 1 t3.2xlarge instance as the PS."
Software Dependencies | No | The paper states that the implementation is "based on TensorFlow", but does not give a TensorFlow version number or versions for any other software dependencies.
Experiment Setup | Yes | "Default Settings. By default, each mini-batch in our model training includes 128 examples. The check period of ADSP is 60 seconds, and each epoch is 20 minutes long. The global learning rate is 1/M (which we find works well through experiments). The local learning rate is initialized to 0.1 and decays exponentially over time." (These defaults are collected into a hedged configuration sketch after the table.)
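The paper's Algorithm 1 is not reproduced in this card, so the following is only an illustrative sketch of what a commit-rate adjustment loop at a scheduler could look like, inferred from the details the card does report (a 60-second check period, heterogeneous workers, a parameter server). The names (WorkerState, TARGET_COMMITS_PER_PERIOD, steps_per_commit) and the target value are hypothetical and not taken from the paper.

```python
import time
from dataclasses import dataclass

# Illustrative sketch only -- NOT the paper's Algorithm 1. It assumes each
# worker reports its measured local training speed (steps/second) and the
# scheduler chooses how many local steps each worker runs between commits so
# that all workers commit to the parameter server at a common target rate.

CHECK_PERIOD_S = 60            # check period reported in the card
TARGET_COMMITS_PER_PERIOD = 4  # hypothetical target, not from the paper

@dataclass
class WorkerState:
    worker_id: str
    steps_per_second: float    # measured locally, reported to the scheduler
    steps_per_commit: int = 1  # local steps between pushes to the PS

def adjust_commit_rates(workers: list[WorkerState]) -> None:
    """Set each worker's steps-per-commit so all commit at the same rate."""
    for w in workers:
        steps_in_period = w.steps_per_second * CHECK_PERIOD_S
        # Faster workers do more local steps per commit; slower ones fewer.
        w.steps_per_commit = max(1, round(steps_in_period / TARGET_COMMITS_PER_PERIOD))

def scheduler_loop(workers: list[WorkerState]) -> None:
    while True:
        adjust_commit_rates(workers)
        # In a real system the new steps_per_commit values would be pushed
        # to the workers here (e.g., over RPC); omitted in this sketch.
        time.sleep(CHECK_PERIOD_S)
```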
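Because the card notes that no explicit train/validation/test split is reported, the sketch below shows only the standard Cifar-10 split as shipped with TensorFlow/Keras (50,000 training and 10,000 test images), plus a hypothetical 10% validation carve-out; the paper may have split the data differently.

```python
import tensorflow as tf

# Standard Cifar-10 split as distributed with Keras: 50,000 train / 10,000 test.
# The 10% validation carve-out below is a hypothetical choice, not the paper's.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

VAL_FRACTION = 0.1
n_val = int(len(x_train) * VAL_FRACTION)          # 5,000 validation images
x_val, y_val = x_train[:n_val], y_train[:n_val]
x_train, y_train = x_train[n_val:], y_train[n_val:]

# Mini-batches of 128 examples, matching the default reported in the card.
train_ds = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
            .shuffle(10_000)
            .batch(128))
```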
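To make the reported defaults easier to scan, here is a minimal sketch that collects them into a training configuration with an exponentially decaying local learning rate. The decay_steps and decay_rate values are assumptions (the paper only says the local rate "decays exponentially over time"), and M is taken here to be the number of worker instances in the testbed.

```python
import tensorflow as tf

NUM_WORKERS_M = 18            # 18 of the 19 EC2 instances are workers; the 19th is the PS
BATCH_SIZE = 128              # examples per mini-batch (paper default)
CHECK_PERIOD_S = 60           # ADSP check period (paper default)
EPOCH_LENGTH_S = 20 * 60      # each epoch is 20 minutes long (paper default)
GLOBAL_LEARNING_RATE = 1.0 / NUM_WORKERS_M  # global learning rate = 1/M

# The paper states the local rate starts at 0.1 and decays exponentially;
# decay_steps and decay_rate below are assumed values for illustration only.
local_lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=1000,
    decay_rate=0.96,
)
local_optimizer = tf.keras.optimizers.SGD(learning_rate=local_lr_schedule)
```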