Spotlight: Optimizing Device Placement for Training Deep Neural Networks

Authors: Yuanxiang Gao, Li Chen, Baochun Li

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have implemented Spotlight in the CIFAR-10 benchmark and deployed it on the Google Cloud platform. Extensive experiments have demonstrated that the training time with placements recommended by Spotlight is 60.9% of that with placements recommended by the policy gradient method.
Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of Toronto; (2) School of Communication and Information Engineering, University of Electronic Science and Technology of China.
Pseudocode | Yes | Algorithm 1: the Spotlight algorithm.
Open Source Code | No | The paper does not provide a link to, or an explicit statement about, the availability of source code for Spotlight itself. While it mentions and links to external benchmarks (TensorFlow CIFAR-10, NMT, RNNLM), it does not release its own implementation.
Open Datasets | Yes | We have implemented Spotlight in the CIFAR-10 image classification benchmark (CNN). ... To demonstrate the generality of performance improvement achieved by Spotlight, we have evaluated it with two more datasets: the TensorFlow Neural Machine Translation (NMT) benchmark (Wu et al., 2016) and the TensorFlow RNN language model (RNNLM) (Jozefowicz et al., 2016).
Dataset Splits | No | The paper does not explicitly state the training, validation, and test splits used, although it relies on standard benchmarks such as CIFAR-10.
Hardware Specification | Yes | We have conducted our experiments with 10 machines on the Google Cloud platform. The machines are equipped with one Intel Broadwell 8-core CPU and either two or four NVIDIA Tesla K80 GPUs each.
Software Dependencies | No | The paper mentions using TensorFlow but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | The policy π in Spotlight is represented by a two-layer sequence-to-sequence recurrent neural network (RNN)... The policy π is initialized with uniformly random distributions, and the hyperparameter β is set to the typical value of 1... Spotlight performs ten stochastic gradient ascent (SGA) steps on this objective... and trains the DNN for ten steps.
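The Experiment Setup row outlines Spotlight's training loop: sample a device placement from the RNN policy, measure the resulting training time, then take ten stochastic gradient ascent steps on a proximal objective with β = 1. A minimal sketch of that loop, assuming an independent softmax policy per operation in place of the paper's two-layer sequence-to-sequence RNN, and a toy cost model in place of measured training time (all constants, names, and the cost model here are hypothetical illustrations, not the paper's implementation):

```python
import math
import random

random.seed(0)

NUM_OPS = 4        # operations in a toy computation graph (hypothetical)
NUM_DEVICES = 2    # e.g. two GPUs on one machine
BETA = 1.0         # proximal coefficient; "set to the typical value of 1"
LR = 0.1           # learning rate for the SGA steps (hypothetical)
SGA_STEPS = 10     # ten stochastic gradient ascent steps per measurement

# Toy policy: independent softmax logits per op, standing in for the
# paper's two-layer sequence-to-sequence RNN policy.
logits = [[0.0] * NUM_DEVICES for _ in range(NUM_OPS)]

def softmax(row):
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def sample_placement():
    """Sample one device per operation from the current policy."""
    placement = []
    for op in range(NUM_OPS):
        probs = softmax(logits[op])
        r, acc, choice = random.random(), 0.0, NUM_DEVICES - 1
        for d, p in enumerate(probs):
            acc += p
            if r < acc:
                choice = d
                break
        placement.append(choice)
    return placement

def runtime(placement):
    """Hypothetical cost model: balanced device loads run fastest.
    In the paper this would be the measured per-step training time."""
    load = [placement.count(d) for d in range(NUM_DEVICES)]
    return 1.0 + 0.5 * (max(load) - min(load))

def sga_step(placement, advantage, old_probs):
    """One ascent step on a proximal-style surrogate: a REINFORCE
    gradient term, minus an approximate gradient of a squared penalty
    that keeps the new policy close to old_probs (weighted by BETA)."""
    for op, dev in enumerate(placement):
        probs = softmax(logits[op])
        for d in range(NUM_DEVICES):
            indicator = 1.0 if d == dev else 0.0
            pg = advantage * (indicator - probs[d])
            prox = 2.0 * BETA * (probs[d] - old_probs[op][d])
            logits[op][d] += LR * (pg - prox)

baseline = None
for _ in range(50):                    # measurement rounds
    placement = sample_placement()
    t = runtime(placement)             # stand-in for a timed training run
    baseline = t if baseline is None else 0.9 * baseline + 0.1 * t
    advantage = baseline - t           # lower runtime => positive advantage
    old_probs = [softmax(row) for row in logits]
    for _ in range(SGA_STEPS):
        sga_step(placement, advantage, old_probs)

best = sample_placement()
print("placement:", best, "runtime:", runtime(best))
```

The proximal penalty plays the role of keeping each batch of ten SGA steps from moving the policy too far from the distribution that generated the measurement, which is the motivation the paper gives for its objective over a plain policy gradient.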