Order Matters: Sequence to sequence for sets

Authors: Oriol Vinyals, Samy Bengio, Manjunath Kudlur

ICLR 2016

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We show empirical evidence of our claims regarding ordering, and on the modifications to the seq2seq framework on benchmark language modeling and parsing tasks, as well as two artificial tasks: sorting numbers and estimating the joint probability of unknown graphical models. The out-of-sample accuracies (whether we succeeded in sorting all numbers or not) of these experiments are summarized in Table 1. |
| Researcher Affiliation | Industry | Oriol Vinyals, Samy Bengio, Manjunath Kudlur, Google Brain, {vinyals, bengio, keveman}@google.com |
| Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | Our model, which naturally handles input sets, has three components (the exact equations and implementation will be released in an appendix prior to publication): |
| Open Datasets | Yes | For this experiment, we use the Penn Tree Bank, which is a standard language modeling benchmark. |
| Dataset Splits | Yes | The results for both natural and reverse matched each other at 86 perplexity on the development set (using the same setup as Zaremba et al. (2014)). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments were provided in the paper. |
| Software Dependencies | No | The paper describes the use of LSTMs and neural network components but does not provide specific software names with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x') that would be necessary for reproducibility. |
| Experiment Setup | Yes | We trained medium sized LSTMs with large amounts of regularization (see medium model from Zaremba et al. (2014)) to estimate probabilities over sequences of words. For each problem, we trained two LSTMs for 10,000 mini-batch iterations to model the joint probability, one where the head random variable was shown first, and one where it was shown last. All the reported accuracies are shown after reaching 10000 training iterations, at which point all models had converged but none overfitted. |
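
The Research Type row quotes the paper's sorting-numbers toy task and its all-or-nothing out-of-sample accuracy. The sketch below illustrates that task's data and metric; the set size, sampling range, and the oracle "prediction" are assumptions for illustration, not the paper's exact protocol.

```python
# Minimal sketch of the sorting toy task: the model receives an unordered set of
# numbers and must output them in sorted order. "Out-of-sample accuracy" counts an
# example as correct only when the entire predicted ordering is right, matching the
# all-or-nothing criterion quoted in the table above.
import random

def make_sorting_example(n=10, rng=random):
    """Return (input_values, target_order), where the target is the argsort of the inputs."""
    xs = [rng.random() for _ in range(n)]                    # assumed: uniform floats in [0, 1)
    target = sorted(range(n), key=lambda i: xs[i])           # indices in ascending order of value
    return xs, target

def exact_match_accuracy(predictions, targets):
    """Fraction of examples whose predicted ordering matches the target exactly."""
    correct = sum(int(p == t) for p, t in zip(predictions, targets))
    return correct / len(targets)

if __name__ == "__main__":
    xs, target = make_sorting_example(n=5)
    pred = sorted(range(len(xs)), key=lambda i: xs[i])        # trivial oracle standing in for a trained model
    print(exact_match_accuracy([pred], [target]))             # 1.0
```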
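The 86-perplexity figure in the Dataset Splits row is a standard word-level perplexity: the exponential of the average per-word negative log-likelihood on the development set. A minimal sketch of that computation, using placeholder log-probabilities rather than the paper's model outputs:

```python
import math

def perplexity(log_probs):
    """log_probs: natural-log probabilities the model assigned to each target word."""
    avg_nll = -sum(log_probs) / len(log_probs)
    return math.exp(avg_nll)

# Example: a model that assigns probability 1/86 to every word has perplexity ~86.
print(perplexity([math.log(1.0 / 86.0)] * 1000))  # ~86.0
```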
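The Experiment Setup row points to the "medium" regularized LSTM of Zaremba et al. (2014) without restating its configuration. The sketch below is one plausible reading of that setup in PyTorch, using the commonly cited medium-model hyperparameters (2 layers, 650 hidden units, 50% dropout, a 10,000-word Penn Tree Bank vocabulary); treat these values and the class name MediumLSTMLM as assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MediumLSTMLM(nn.Module):
    """Word-level LSTM language model in the spirit of the Zaremba et al. (2014) medium model."""

    def __init__(self, vocab_size, hidden_size=650, num_layers=2, dropout=0.5):
        super().__init__()
        self.drop = nn.Dropout(dropout)
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # Dropout is applied between stacked LSTM layers and on inputs/outputs,
        # not on the recurrent connections, following Zaremba et al. (2014).
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers,
                            dropout=dropout, batch_first=True)
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, state=None):
        x = self.drop(self.embed(tokens))       # (batch, steps, hidden)
        out, state = self.lstm(x, state)
        logits = self.decoder(self.drop(out))   # (batch, steps, vocab)
        return logits, state

# Commonly reported medium-model training settings (assumed here, not restated in the
# paper): batch size 20, 35-step unrolling, gradient-norm clipping at 5, learning rate
# 1.0 with decay after the first few epochs.
if __name__ == "__main__":
    model = MediumLSTMLM(vocab_size=10000)
    tokens = torch.randint(0, 10000, (20, 35))  # dummy batch: 20 sequences of 35 word ids
    logits, _ = model(tokens)
    print(logits.shape)                         # torch.Size([20, 35, 10000])
```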