On-the-fly Operation Batching in Dynamic Computation Graphs

Authors: Graham Neubig, Yoav Goldberg, Chris Dyer

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section we describe our experiments, designed to answer three main questions: (1) in situations where manual batching is easy, how close can the proposed method approach the efficiency of a program that uses hand-crafted manual batching, and how do the depth-based and agenda-based approaches compare (§4.1)? (2) in situations where manual batching is less easy, is the proposed method capable of obtaining significant improvements in efficiency (§4.2)? (3) how does the proposed method compare to TensorFlow Fold, an existing method for batching variably structured networks within a static declaration framework (§4.3)?"
Researcher Affiliation | Collaboration | Graham Neubig, Language Technologies Institute, Carnegie Mellon University, gneubig@cs.cmu.edu; Yoav Goldberg, Computer Science Department, Bar-Ilan University, yogo@cs.biu.ac.il; Chris Dyer, DeepMind, cdyer@google.com
Pseudocode | Yes | "Pseudo-code for constructing the graph for each of the RNNs on the left using a dynamic declaration framework is as follows:" function RNN-REGRESSION-LOSS(x_{1:n}, y; θ = (W, U, b, c)) ...; function TRAIN-BATCH-NAIVE(T = {(x^{(i)}_{1:n^{(i)}}, y^{(i)})}_{i=1}^{b}; θ): NEW-GRAPH() ...; function RNN-REGRESSION-BATCH-LOSS(X_{1:n_max}, Y, n^{(1:b)}; θ = (W, U, b, c)) ...; and function TRAIN-BATCH-MANUAL(T = {(x^{(i)}_{1:n^{(i)}}, y^{(i)})}_{i=1}^{b}; θ): n_max = max_i n^{(i)} ... (a runnable sketch of the naive loop appears after the table)
Open Source Code | Yes | "The proposed algorithm is implemented in DyNet (http://dynet.io/), and can be activated by using the --dynet-autobatch 1 command line flag." (see the usage sketch after the table)
Open Datasets | Yes | "images of a fixed size such as those in the MNIST and CIFAR datasets"; "we train a bi-directional LSTM sequence labeler [12, 23] on synthetic data where every sequence to be labeled is the same length (40)"; and "We compare to the TensorFlow Fold reference implementation of the Stanford Sentiment Treebank regression task [30]."
Dataset Splits | No | The paper mentions "The batch size is 64" and the use of synthetic data and actual variable-length sequences, but it does not provide specific percentages or counts for training, validation, or test splits. It only mentions evaluation on the dev set in the context of a comparison with TensorFlow Fold, not for the authors' own experimental setup.
Hardware Specification | Yes | Experiments were run on a single Tesla K80 GPU or an Intel Xeon E5-2686 v4 2.30GHz CPU.
Software Dependencies | No | The paper mentions toolkits such as PyTorch, DyNet, Chainer, TensorFlow, CNTK, and Theano, but it does not specify version numbers for any of these software dependencies.
Experiment Setup | Yes | "The batch size is 64." and "The network takes as input a size 200 embedding vector from a vocabulary of size 1000, has 2 layers of 256 hidden node LSTMs in either direction, then predicts a label from one of 300 classes." (see the architecture sketch below)
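
The pseudocode quoted in the Pseudocode row is easiest to follow as running code. Below is a minimal sketch, assuming DyNet's Python bindings, of the RNN-REGRESSION-LOSS and TRAIN-BATCH-NAIVE routines; the dimensions, the data format, and the squared-error loss are illustrative assumptions rather than details taken from the paper.

```python
import dynet as dy

# Illustrative dimensions; the paper's pseudocode leaves these abstract.
IN_DIM, HID_DIM = 32, 64

model = dy.ParameterCollection()
W = model.add_parameters((HID_DIM, IN_DIM + HID_DIM))
b = model.add_parameters((HID_DIM,))
U = model.add_parameters((1, HID_DIM))
c = model.add_parameters((1,))
trainer = dy.SimpleSGDTrainer(model)

def rnn_regression_loss(xs, y):
    """RNN-REGRESSION-LOSS: run a simple RNN over xs, return squared error vs. scalar y."""
    W_e, b_e = dy.parameter(W), dy.parameter(b)
    U_e, c_e = dy.parameter(U), dy.parameter(c)
    h = dy.inputVector([0.0] * HID_DIM)            # h_0 = 0
    for x in xs:                                   # xs: list of length-IN_DIM float vectors
        h = dy.tanh(W_e * dy.concatenate([dy.inputVector(x), h]) + b_e)
    y_hat = U_e * h + c_e
    return dy.squared_distance(y_hat, dy.inputVector([y]))

def train_batch_naive(batch):
    """TRAIN-BATCH-NAIVE: one fresh graph per minibatch, one loss expression per instance."""
    dy.renew_cg()                                  # NEW-GRAPH()
    losses = [rnn_regression_loss(xs, y) for xs, y in batch]
    total = dy.esum(losses)                        # sum of per-instance losses
    total.forward()
    total.backward()
    trainer.update()
```

This naive per-instance loop is the form the paper's method is designed to accelerate: the single-instance operations built here can be grouped into batched operations at execution time rather than by hand.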
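The Open Source Code row quotes the --dynet-autobatch 1 flag. That flag is passed on the command line of a DyNet program; the dynet_config route shown as an alternative is an assumption about the Python bindings and may differ between DyNet versions.

```python
# Command line, using the flag quoted in the paper:
#   python train.py --dynet-autobatch 1
#
# Alternative (assumed dynet_config API; version-dependent): set the option
# in code before dynet is imported, so it takes effect at initialization.
import dynet_config
dynet_config.set(autobatch=True)
import dynet as dy
```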
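For the Experiment Setup row, the quoted sizes can be turned into a concrete model definition. The sketch below, again assuming DyNet's Python API, uses the stated dimensions (vocabulary 1000, embeddings 200, 2 LSTM layers of 256 units per direction, 300 output classes); how the forward and backward states are combined and the per-token softmax loss are my assumptions, not details given in the excerpt.

```python
import dynet as dy

# Sizes quoted in the paper's setup description.
VOCAB, EMB, HID, LAYERS, CLASSES = 1000, 200, 256, 2, 300

model = dy.ParameterCollection()
embeds = model.add_lookup_parameters((VOCAB, EMB))
fwd = dy.LSTMBuilder(LAYERS, EMB, HID, model)      # forward LSTM stack
bwd = dy.LSTMBuilder(LAYERS, EMB, HID, model)      # backward LSTM stack
W_out = model.add_parameters((CLASSES, 2 * HID))
b_out = model.add_parameters((CLASSES,))

def tagging_loss(word_ids, tag_ids):
    """Sum of per-token negative log-likelihoods for one labeled sequence."""
    xs = [embeds[w] for w in word_ids]
    fs = fwd.initial_state().transduce(xs)
    bs = list(reversed(bwd.initial_state().transduce(list(reversed(xs)))))
    W, b = dy.parameter(W_out), dy.parameter(b_out)
    losses = []
    for f, bk, t in zip(fs, bs, tag_ids):
        scores = W * dy.concatenate([f, bk]) + b   # 300-way scores per token
        losses.append(dy.pickneglogsoftmax(scores, t))
    return dy.esum(losses)
```

With the batch size of 64 quoted above, per-sequence losses like this one would be summed across the minibatch before calling backward, which is the pattern the autobatching flag is meant to speed up.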