Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks

Authors: Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our evaluation shows that layer-wise parallelism outperforms state-of-the-art approaches by increasing training throughput, reducing communication costs, and achieving better scalability to multiple GPUs, while maintaining original network accuracy. |
| Researcher Affiliation | Collaboration | Stanford University and Microsoft. Correspondence to: Zhihao Jia <zhihao@cs.stanford.edu>. |
| Pseudocode | Yes | Algorithm 1 gives pseudocode that uses node and edge eliminations as subroutines to find an optimal parallelization strategy under the paper's cost model (see the first sketch after this table). |
| Open Source Code | No | The paper states "we implemented our framework in Legion..." but provides no access link or statement about releasing the source code for the implementation. |
| Open Datasets | Yes | The runtime performance of all three CNNs is evaluated on the ImageNet-1K dataset (Deng et al., 2009), which consists of 1.2 million images from 1,000 categories. |
| Dataset Splits | No | No specific details on training, validation, or test splits (e.g., percentages, sample counts, or explicit use of the standard splits for the datasets used) were found. |
| Hardware Specification | Yes | All experiments were performed on a GPU cluster with 4 compute nodes, each equipped with two 10-core Intel E5-2600 CPUs, 256 GB of main memory, and four NVIDIA Tesla P100 GPUs. |
| Software Dependencies | Yes | Data-parallelism experiments were run in TensorFlow r1.7, PyTorch v0.3, and the authors' implementation, and their runtime performance was compared. |
| Experiment Setup | Yes | Synchronous training with a per-GPU batch size of 32 is used for all experiments (see the second sketch after this table). |
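The paper's Algorithm 1 is not reproduced here, but the following is a minimal Python sketch of the node/edge-elimination idea the Pseudocode row refers to: per-edge cost tables over candidate per-layer parallelization configurations are merged when edges are parallel (edge elimination) and folded over interior layers (node elimination) until only a source and a sink remain, at which point the cheapest remaining configuration pair gives the cost of an optimal strategy under the cost model. The names and signatures (`configs`, `node_cost`, `xfer_cost`) are illustrative assumptions rather than the authors' Legion/C++ implementation, the sketch assumes the computation graph reduces fully under these two eliminations, and the backtracking needed to recover the configurations themselves is omitted.

```python
from collections import defaultdict

def optimal_strategy_cost(nodes, edges, configs, node_cost, xfer_cost):
    """nodes: layers with a single source and sink at nodes[0] / nodes[-1] (assumption);
    edges: iterable of (u, v) tensor dependencies;
    configs[n]: candidate parallelization configurations for layer n;
    node_cost(n, c): estimated compute time of layer n under config c;
    xfer_cost(u, cu, v, cv): estimated transfer/synchronization time across an edge."""
    table = defaultdict(lambda: defaultdict(float))   # one cost table per (u, v) pair
    in_edges, out_edges = defaultdict(set), defaultdict(set)

    def add_edge(u, v, tbl):
        # Edge elimination: parallel edges between u and v are merged by
        # summing their cost tables entry-wise.
        for key, val in tbl.items():
            table[(u, v)][key] += val
        in_edges[v].add(u)
        out_edges[u].add(v)

    for u, v in edges:
        add_edge(u, v, {(cu, cv): xfer_cost(u, cu, v, cv)
                        for cu in configs[u] for cv in configs[v]})

    src, dst = nodes[0], nodes[-1]
    remaining = set(nodes)
    changed = True
    while changed:
        changed = False
        for w in list(remaining - {src, dst}):
            if len(in_edges[w]) == 1 and len(out_edges[w]) == 1:
                u = next(iter(in_edges[w]))
                v = next(iter(out_edges[w]))
                # Node elimination: fold w's compute cost and its two incident
                # cost tables into a single (u, v) table, minimizing over w's configs.
                merged = {(cu, cv): min(table[(u, w)][(cu, cw)] + node_cost(w, cw)
                                        + table[(w, v)][(cw, cv)]
                                        for cw in configs[w])
                          for cu in configs[u] for cv in configs[v]}
                del table[(u, w)], table[(w, v)]
                out_edges[u].discard(w)
                in_edges[v].discard(w)
                add_edge(u, v, merged)
                remaining.discard(w)
                changed = True

    # Only the source and sink remain: pick the cheapest configuration pair.
    return min(table[(src, dst)][(cs, ct)] + node_cost(src, cs) + node_cost(dst, ct)
               for cs in configs[src] for ct in configs[dst])
```

The appeal of this elimination-based search is that its cost grows with the number of configuration pairs per edge rather than with the exponentially many global strategies, which is what makes an optimal choice under the cost model tractable.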
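For the Experiment Setup and Software Dependencies rows, the following is a minimal sketch of a synchronous data-parallel baseline with a per-GPU batch size of 32. It is written against a modern torch.distributed API rather than the TensorFlow r1.7 / PyTorch v0.3 versions the paper used, and the model, optimizer settings, and random batch are placeholders, not the authors' configuration.

```python
# Launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
import torchvision

def main():
    dist.init_process_group("nccl")                  # one process per GPU
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    model = torchvision.models.resnet50().cuda()     # placeholder network
    model = DDP(model, device_ids=[local_rank])      # gradients all-reduced every step
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()

    per_gpu_batch = 32                               # matches the reported setup
    # Placeholder random batch; a real run would use an ImageNet-1K DataLoader
    # with a DistributedSampler.
    images = torch.randn(per_gpu_batch, 3, 224, 224, device="cuda")
    labels = torch.randint(0, 1000, (per_gpu_batch,), device="cuda")

    for _ in range(10):
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()                              # synchronous gradient all-reduce
        opt.step()

if __name__ == "__main__":
    main()
```

Fixing the per-GPU batch size means the global batch grows with the number of GPUs, so throughput comparisons across GPU counts measure weak scaling of synchronous training, which is consistent with the setup the table reports.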