Learning by Abstraction: The Neural State Machine
Authors: Drew A. Hudson, Christopher D. Manning
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on VQA-CP and GQA, two recent VQA datasets that involve compositionality, multi-step inference and diverse reasoning skills, achieving state-of-the-art results in both cases. We provide further experiments that illustrate the model's strong generalization capacity across multiple dimensions, including novel compositions of concepts, changes in the answer distribution, and unseen linguistic structures, demonstrating the qualities and efficacy of our approach. |
| Researcher Affiliation | Academia | Drew A. Hudson, Stanford University, 353 Serra Mall, Stanford, CA 94305, dorarad@cs.stanford.edu; Christopher D. Manning, Stanford University, 353 Serra Mall, Stanford, CA 94305, manning@cs.stanford.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The model has been implemented in Tensorflow, and will be released along with the features and instructions for reproducing the described experiments. |
| Open Datasets | Yes | We evaluate our model (NSM) on two recent VQA datasets: (1) The GQA dataset [41] which focuses on real-world visual reasoning and compositional question answering, and (2) VQA-CP (version 2) [3], a recent split of the VQA dataset [27] that has been particularly designed to test generalization skills across changes in the answer distribution between the training and the test sets. |
| Dataset Splits | No | The paper mentions using "validation set" for GQA in Section 6.5, but does not provide specific details on its size, proportion, or how it was split from the main dataset. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states the model was 'implemented in Tensorflow' but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | "Both our model and implemented baselines are trained to minimize the cross-entropy loss of the predicted candidate answer (out of the top 2000 possibilities), using a hidden state size of d = 300 and, unless otherwise stated, length of N = 8 computation steps for the MAC and NSM models. Please refer to section 6.5 for further information about the training procedure, implementation details, hyperparameter configuration and data preprocessing..." (Section 4); "We use the Adam optimizer [47] with an initial learning rate of 1e-4, decaying by 0.2 after every 3 epochs... We train the model for 30 epochs with a batch size of 64." (Section 6.5). A hedged configuration sketch based on these values follows the table. |
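
The quoted setup can be collected into a small configuration sketch. This is a minimal illustration assuming TensorFlow 2.x / Keras; the NSM architecture itself is not reproduced here, and the names `HPARAMS` and `make_optimizer` are hypothetical helpers introduced for this sketch, not part of the authors' (unreleased) code.

```python
# Hedged sketch of the reported hyperparameters; assumes TensorFlow 2.x.
import tensorflow as tf

# Values quoted from Sections 4 and 6.5 of the paper.
HPARAMS = {
    "hidden_dim": 300,        # hidden state size d = 300
    "num_steps": 8,           # N = 8 computation steps (MAC / NSM)
    "num_answers": 2000,      # top-2000 candidate answers
    "batch_size": 64,
    "epochs": 30,
    "init_lr": 1e-4,          # Adam, decayed by 0.2 after every 3 epochs
    "lr_decay_rate": 0.2,
    "lr_decay_epochs": 3,
}

def make_optimizer(steps_per_epoch: int) -> tf.keras.optimizers.Optimizer:
    """Adam with a stepwise schedule: multiply the learning rate by 0.2 every 3 epochs."""
    schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=HPARAMS["init_lr"],
        decay_steps=HPARAMS["lr_decay_epochs"] * steps_per_epoch,
        decay_rate=HPARAMS["lr_decay_rate"],
        staircase=True,  # decay in discrete steps rather than continuously
    )
    return tf.keras.optimizers.Adam(learning_rate=schedule)

# Cross-entropy over the 2000 candidate answers, as described in Section 4.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
```

With `staircase=True`, the schedule applies the factor of 0.2 once per 3-epoch boundary, matching the paper's stated decay; the actual model definition, features, and data preprocessing would still have to come from the authors' released code.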