Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming
Authors: Fei Wang, James Decker, Xilun Wu, Gregory Essertel, Tiark Rompf
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Evaluation and Case Studies): As shown in Figure 6, we compared Lantern with TensorFlow and PyTorch (DyNet implementation was only introduced for TreeLSTM for the benefit of autobatching). The training loss (not shown) in all architectures had similar decay, indicating that Lantern correctly implements backward propagation. We elected to only gauge the runtime of training loops, as that is the majority of computation. |
| Researcher Affiliation | Academia | Fei Wang Purdue University West Lafayette, IN 47906 wang603@purdue.edu James Decker Purdue University West Lafayette, IN 47906 decker31@purdue.edu Xilun Wu Purdue University West Lafayette, IN 47906 wu636@purdue.edu Grégory Essertel Purdue University West Lafayette, IN, 47906 gesserte@purdue.edu Tiark Rompf Purdue University West Lafayette, IN, 47906 tiark@purdue.edu |
| Pseudocode | Yes | Figure 2: Automatic Differentiation in Scala: reverse-mode AD by callbacks and operator overloading (left), and the grad function definition and use case (right). ... Figure 3: Program Transformation between direct style (left) and CPS (right). ... Figure 4: Automatic Differentiation in Scala: reverse-mode using delimited continuations with shift/reset operators (left), and grad function definition and use case (right). (An illustrative sketch of the callback style appears after this table.) |
| Open Source Code | Yes | In this section, we validate our design by implementing and evaluating our prototypic framework, dubbed Lantern. Lantern builds on the code in earlier sections, but supports handling tensor objects (multi-dimension arrays with common linear algebra operations such as element-wise operations with broadcasting, matrix multiplication, and convolution). ... https://github.com/feiwang3311/Lantern |
| Open Datasets | Yes | We would like to give extra attention to the evaluation of TreeLSTM, which is adapted from Sentiment Classification using the dataset from the Stanford Sentiment Treebank (Chuang, 2013) following the work of Tai et al. (2015). |
| Dataset Splits | No | For vanilla RNN and LSTM, we evaluated at batch size 20. The training time for Lantern in both cases is less compared with that of PyTorch, and comparable to that of TensorFlow. For CNN, the evaluation was done at batch size 100... As such, both Lantern and PyTorch were run at batch size 1... The paper mentions batch sizes and training, but does not provide explicit training/validation/test dataset splits (e.g., percentages or sample counts) for reproduction. |
| Hardware Specification | Yes | All experiments were run using a single CPU on a cluster with Intel Xeon Platinum 8168 CPUs at 2.70GHz and 0.75 TB RAM per node. |
| Software Dependencies | No | While some operations are linked to the OpenBLAS implementation, most operations are implemented as simple C++ loops. ... comparing with PyTorch, TensorFlow, and DyNet (Neubig et al., 2017). ... The paper mentions several software components like OpenBLAS, C++, PyTorch, TensorFlow, and DyNet, but it does not provide specific version numbers for any of these dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | For vanilla RNN and LSTM, we evaluated at batch size 20. ... For CNN, the evaluation was done at batch size 100... As such, both Lantern and PyTorch were run at batch size 1... |
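
The paper's Figure 2 illustrates the idea named in the title: reverse-mode AD implemented with operator overloading and explicit callbacks, where each overloaded operator computes its forward value, invokes a callback standing for the rest of the computation, and then accumulates gradients on the way back. The sketch below shows that style in plain Scala; the names `NumR` and `grad` follow the paper's figures, but the exact definitions here are an approximation for illustration, not the authors' code.

```scala
// Minimal sketch of reverse-mode AD via callbacks and operator overloading
// (in the style of the paper's Figure 2; not the authors' implementation).
// NumR carries a value x and an accumulator d for its gradient.
class NumR(val x: Double, var d: Double = 0.0) {
  // Each operator returns a function expecting a callback k that represents
  // the rest of the computation. The forward value y is passed to k; once k
  // returns, the operands' gradients are updated from y.d (the backward pass).
  def +(that: NumR): (NumR => Unit) => Unit = { k =>
    val y = new NumR(x + that.x)
    k(y)
    this.d += y.d
    that.d += y.d
  }
  def *(that: NumR): (NumR => Unit) => Unit = { k =>
    val y = new NumR(x * that.x)
    k(y)
    this.d += that.x * y.d
    that.d += this.x * y.d
  }
}

object CallbackAD {
  // grad seeds the output gradient with 1.0 and reads back the input gradient.
  def grad(f: NumR => (NumR => Unit) => Unit)(x: Double): Double = {
    val in = new NumR(x)
    f(in) { out => out.d = 1.0 }
    in.d
  }

  def main(args: Array[String]): Unit = {
    // d/dx (x*x + x) = 2x + 1, so at x = 3.0 this prints 7.0.
    println(grad(x => k => (x * x) { y => (y + x)(k) })(3.0))
  }
}
```

Figure 4 of the paper expresses the same pattern more concisely using Scala's delimited-continuation operators `shift`/`reset`, which make the callback (the continuation) implicit, and Figure 3 relates the two via the transformation between direct style and CPS.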