Learning by Abstraction: The Neural State Machine
Authors: Drew A. Hudson, Christopher D. Manning
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on VQA-CP and GQA, two recent VQA datasets that involve compositionality, multi-step inference and diverse reasoning skills, achieving state-of-the-art results in both cases. We provide further experiments that illustrate the model's strong generalization capacity across multiple dimensions, including novel compositions of concepts, changes in the answer distribution, and unseen linguistic structures, demonstrating the qualities and efficacy of our approach. |
| Researcher Affiliation | Academia | Drew A. Hudson, Stanford University, 353 Serra Mall, Stanford, CA 94305, dorarad@cs.stanford.edu; Christopher D. Manning, Stanford University, 353 Serra Mall, Stanford, CA 94305, manning@cs.stanford.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The model has been implemented in Tensorflow, and will be released along with the features and instructions for reproducing the described experiments. |
| Open Datasets | Yes | We evaluate our model (NSM) on two recent VQA datasets: (1) The GQA dataset [41] which focuses on real-world visual reasoning and compositional question answering, and (2) VQA-CP (version 2) [3], a recent split of the VQA dataset [27] that has been particularly designed to test generalization skills across changes in the answer distribution between the training and the test sets. |
| Dataset Splits | No | The paper mentions using "validation set" for GQA in Section 6.5, but does not provide specific details on its size, proportion, or how it was split from the main dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states the model was 'implemented in Tensorflow' but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | "Both our model and implemented baselines are trained to minimize the cross-entropy loss of the predicted candidate answer (out of the top 2000 possibilities), using a hidden state size of d = 300 and, unless otherwise stated, length of N = 8 computation steps for the MAC and NSM models. Please refer to section 6.5 for further information about the training procedure, implementation details, hyperparameter configuration and data preprocessing..." (Section 4); "We use the Adam optimizer [47] with an initial learning rate of 1e-4, decaying by 0.2 after every 3 epochs... We train the model for 30 epochs with a batch size of 64." (Section 6.5). A hedged configuration sketch based on these values follows the table. |
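
The quoted setup can be collected into a small configuration sketch. This is a minimal illustration assuming TensorFlow 2.x / Keras; the NSM architecture itself is not reproduced here, and the names `HPARAMS` and `make_optimizer` are hypothetical helpers introduced for this sketch, not part of the authors' (unreleased) code.

```python
# Hedged sketch of the reported hyperparameters; assumes TensorFlow 2.x.
import tensorflow as tf

# Values quoted from Sections 4 and 6.5 of the paper.
HPARAMS = {
    "hidden_dim": 300,        # hidden state size d = 300
    "num_steps": 8,           # N = 8 computation steps (MAC / NSM)
    "num_answers": 2000,      # top-2000 candidate answers
    "batch_size": 64,
    "epochs": 30,
    "init_lr": 1e-4,          # Adam, decayed by 0.2 after every 3 epochs
    "lr_decay_rate": 0.2,
    "lr_decay_epochs": 3,
}

def make_optimizer(steps_per_epoch: int) -> tf.keras.optimizers.Optimizer:
    """Adam with a stepwise schedule: multiply the learning rate by 0.2 every 3 epochs."""
    schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=HPARAMS["init_lr"],
        decay_steps=HPARAMS["lr_decay_epochs"] * steps_per_epoch,
        decay_rate=HPARAMS["lr_decay_rate"],
        staircase=True,  # decay in discrete steps rather than continuously
    )
    return tf.keras.optimizers.Adam(learning_rate=schedule)

# Cross-entropy over the 2000 candidate answers, as described in Section 4.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
```

With `staircase=True`, the schedule applies the factor of 0.2 once per 3-epoch boundary, matching the paper's stated decay; the actual model definition, features, and data preprocessing would still have to come from the authors' released code.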