Deriving Neural Architectures from Sequence and Graph Kernels
Authors: Tao Lei, Wengong Jin, Regina Barzilay, Tommi Jaakkola
ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In this section, we apply the proposed sequence and graph modules to various tasks and empirically evaluate their performance against other neural network models. These tasks include language modeling, sentiment classification and molecule regression." |
| Researcher Affiliation | Academia | "MIT Computer Science & Artificial Intelligence Laboratory." |
| Pseudocode | No | The paper describes the neural operations using mathematical equations (e.g., Eq. 4, 6, 8, 9) but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | "Code available at https://github.com/taolei87/icml17_knn" |
| Open Datasets | Yes | "We use the Penn Tree Bank (PTB) corpus as the benchmark." and "We use the Stanford Sentiment Treebank benchmark (Socher et al., 2013)." and "We further evaluate our graph NN models on the Harvard Clean Energy Project benchmark, which has been used in Dai et al. (2016); Duvenaud et al. (2015) as their evaluation dataset." |
| Dataset Splits | Yes | "We use the standard train/development/test split of this dataset with vocabulary of size 10,000." |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., specific GPU/CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper mentions optimizers like SGD and Adam, and techniques like dropout, but does not provide specific software dependencies (e.g., library names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | "Following standard practice, we use SGD with an initial learning rate of 1.0 and decrease the learning rate by a constant factor after a certain epoch. We back-propagate the gradient with an unroll size of 35 and use dropout (Hinton et al., 2012) as the regularization." and "Our best model is a 3-layer network with n = 2 and hidden dimension d = 200. ... The model is optimized with Adam (Kingma & Ba, 2015), and dropout probability of 0.35." (see the configuration sketch after this table) |
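
To make the quoted hyperparameters concrete, the sketch below assembles them into a minimal PyTorch training configuration. The `StandInLM` module is a generic LSTM placeholder, not the paper's kernel-derived architecture (that lives in the linked repository), and the learning-rate milestone and decay factor are assumptions, since the paper only states that the rate is decreased "by a constant factor after a certain epoch".

```python
import torch
import torch.nn as nn

class StandInLM(nn.Module):
    """Generic stand-in model; hyperparameters follow the quotes in the table above."""
    def __init__(self, vocab_size=10000, d=200, layers=3, dropout=0.35):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)   # PTB vocabulary of size 10,000
        self.drop = nn.Dropout(dropout)            # dropout probability 0.35 (best model)
        self.rnn = nn.LSTM(d, d, num_layers=layers, dropout=dropout)  # 3 layers, d = 200
        self.out = nn.Linear(d, vocab_size)

    def forward(self, x, state=None):
        h, state = self.rnn(self.drop(self.embed(x)), state)
        return self.out(self.drop(h)), state

model = StandInLM()

# Language-modeling runs: SGD with initial learning rate 1.0, decayed by a
# constant factor after a certain epoch (milestone and factor below are assumed).
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[10], gamma=0.5)

# The best sentiment/regression model is instead optimized with Adam:
# optimizer = torch.optim.Adam(model.parameters())

UNROLL = 35  # gradients are back-propagated with an unroll size of 35
```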