Efficient Rematerialization for Deep Networks
Authors: Ravi Kumar, Manish Purohit, Zoya Svitkina, Erik Vee, Joshua Wang
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate the performance of these algorithms on many common deep learning models. ... We experimentally evaluate the performance of our rematerialization algorithm on computational graphs for training commonly used deep neural networks. |
| Researcher Affiliation | Industry | Ravi Kumar Google Research Mountain View, CA 94043 ravi.k53@gmail.com Manish Purohit Google Research Mountain View, CA 94043 mpurohit@google.com Zoya Svitkina Google Research Mountain View, CA 94043 zoya@google.com Erik Vee Google Research Mountain View, CA 94043 erikvee@google.com Joshua R. Wang Google Research Mountain View, CA 94043 joshuawang@google.com |
| Pseudocode | Yes | Algorithm 1: Efficient Rematerialization via Tree Decomposition. |
| Open Source Code | No | The paper refers to third-party open-source implementations (e.g., official ResNet and Transformer models in TensorFlow) that they used for evaluation, but does not provide specific links or statements about releasing their own implementation code for the rematerialization algorithm. |
| Open Datasets | Yes | We use the official implementation of the ResNet model for the ImageNet task in TensorFlow. ... (i) Deep Residual Networks (ResNet): We first consider deep residual networks (ResNet) [13] as an example of convolutional networks for image classification. |
| Dataset Splits | No | The paper mentions using models like ResNet, FFN, and Transformer, and conducting experiments, but it does not specify explicit training, validation, or test dataset split percentages or counts. |
| Hardware Specification | No | The paper mentions general hardware such as 'GPUs and AI accelerators' and 'GPU and CPU memory' but does not specify any exact models (e.g., NVIDIA V100, Intel Xeon), quantities, or detailed system specifications used for their experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow' for model implementations and 'XLA' as a baseline, but it does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We use different configurations to measure the effect of network depth (number of convolutional layers) on memory requirements of schedules obtained by the algorithms. ... For this experiment, we setup a simple feed-forward network with ReLU activations (number of hidden layers is varied) and randomly generated inputs and outputs. We use mean squared error loss and train using standard gradient descent. ... Again, we use the official implementation of Transformer in TensorFlow with all hyperparameters set to recommended defaults. |
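
The "Experiment Setup" row above describes a feed-forward network with ReLU activations, randomly generated inputs and outputs, mean squared error loss, and standard gradient descent. The following is a minimal sketch of such a setup in TensorFlow, not the authors' code; all layer widths, data dimensions, the depth, and the learning rate are illustrative assumptions, since the paper excerpt does not report them.

```python
import tensorflow as tf

# Assumed configuration values (not reported in the quoted excerpt).
NUM_HIDDEN_LAYERS = 4   # the paper varies this; exact range not given here
HIDDEN_UNITS = 256
BATCH_SIZE = 32
INPUT_DIM = 128
OUTPUT_DIM = 10

# Randomly generated inputs and targets, as in the described setup.
x = tf.random.normal([BATCH_SIZE, INPUT_DIM])
y = tf.random.normal([BATCH_SIZE, OUTPUT_DIM])

# Feed-forward network with ReLU activations and a linear output layer.
layers = [tf.keras.layers.Dense(HIDDEN_UNITS, activation="relu")
          for _ in range(NUM_HIDDEN_LAYERS)]
layers.append(tf.keras.layers.Dense(OUTPUT_DIM))
model = tf.keras.Sequential(layers)

# Mean squared error loss trained with plain gradient descent (SGD).
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

for step in range(100):
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
```

In the paper's experiments, the quantity of interest is not the trained model but the memory footprint of the schedules the rematerialization algorithm produces for the resulting computational graph; the sketch only reconstructs the network and training loop that generate that graph.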