A simple neural network module for relational reasoning

Authors: Adam Santoro, David Raposo, David G.T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested RN-augmented networks on three tasks: visual question answering using a challenging dataset called CLEVR, on which we achieve state-of-the-art, super-human performance; text-based question answering using the bAbI suite of tasks; and complex reasoning about dynamic physical systems.
Researcher Affiliation | Industry | Adam Santoro (adamsantoro@google.com), David Raposo (draposo@google.com), David G.T. Barrett (barrettdavid@google.com), Mateusz Malinowski (mateuszm@google.com), Razvan Pascanu (razp@google.com), Peter Battaglia (peterbattaglia@google.com), Timothy Lillicrap (countzero@google.com); DeepMind, London, United Kingdom.
Pseudocode | No | The paper describes the architecture and functions mathematically and in text but does not include structured pseudocode or algorithm blocks. (Its core formula, RN(O) = fφ(Σ_{i,j} gθ(o_i, o_j)), is sketched in code after this table.)
Open Source Code | No | The paper states that the Sort-of-CLEVR dataset will be made publicly available, but there is no explicit statement or link indicating that the source code for their methodology is open-source or publicly available.
Open Datasets | Yes | We used two versions of the CLEVR dataset: (i) the pixel version, in which images were represented in standard 2D pixel form; (ii) a state description version, in which images were explicitly represented by state description matrices containing factored object descriptions. The Sort-of-CLEVR dataset will be made publicly available online. Our model was trained on the joint version of bAbI (all 20 tasks simultaneously), using the full dataset of 10K examples per task.
Dataset Splits | Yes | Our model was trained on the joint version of bAbI (all 20 tasks simultaneously), using the full dataset of 10K examples per task. Our model achieved state-of-the-art performance on CLEVR at 95.5%, exceeding the best model trained only on the pixel images and questions at the time of the dataset's publication by 27%, and surpassing human performance in the task (see Table 1 and Figure 3). The model we evaluated was chosen based on overall performance on a withheld validation set, using a single seed.
Hardware Specification | No | The paper mentions 'distributed training with 10 workers synchronously updating a central parameter server' but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for the experiments.
Software Dependencies | No | The paper mentions the use of the 'Adam optimizer' and 'ReLU non-linearities', but it does not specify versions for any programming languages, libraries, or frameworks (e.g., Python version, TensorFlow/PyTorch version).
Experiment Setup | Yes | For the CLEVR-from-pixels task we used: 4 convolutional layers each with 24 kernels, ReLU non-linearities, and batch normalization; a 128-unit LSTM for question processing; 32-unit word-lookup embeddings; a four-layer MLP consisting of 256 units per layer with ReLU non-linearities for gθ; and a three-layer MLP consisting of 256, 256 (with 50% dropout), and 29 units with ReLU non-linearities for fφ. The final layer was a linear layer that produced logits for a softmax over the answer vocabulary. The softmax output was optimized with a cross-entropy loss function using the Adam optimizer with a learning rate of 2.5e-4. We used size 64 mini-batches and distributed training with 10 workers synchronously updating a central parameter server. (A hedged code sketch of this configuration appears after this table.)
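
Although the paper itself contains no pseudocode block, its core formulation, RN(O) = fφ(Σ_{i,j} gθ(o_i, o_j)), is compact enough to sketch directly. The following is a minimal, hypothetical PyTorch rendering; the class name, layer sizes, and dimensions are our assumptions, not the authors' code.

    import torch
    import torch.nn as nn

    class RelationNetwork(nn.Module):
        """Minimal sketch of RN(O) = f_phi(sum_{i,j} g_theta(o_i, o_j)).

        Layer sizes and dimensions are illustrative assumptions, not the
        paper's exact CLEVR configuration (see the second sketch below).
        """

        def __init__(self, object_dim=32, hidden=256, out_dim=10):
            super().__init__()
            # g_theta scores one ordered pair of objects.
            self.g = nn.Sequential(
                nn.Linear(2 * object_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU())
            # f_phi maps the summed relation vector to the output.
            self.f = nn.Sequential(
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, out_dim))

        def forward(self, objects):
            # objects: (batch, n_objects, object_dim)
            b, n, d = objects.shape
            o_i = objects.unsqueeze(2).expand(b, n, n, d)  # o_i repeated over j
            o_j = objects.unsqueeze(1).expand(b, n, n, d)  # o_j repeated over i
            pairs = torch.cat([o_i, o_j], dim=-1).reshape(b, n * n, 2 * d)
            # Apply g_theta to every pair, sum over all pairs, then f_phi.
            return self.f(self.g(pairs).sum(dim=1))

A call such as RelationNetwork()(torch.randn(4, 8, 32)) yields a (4, 10) tensor; the sum over pairs is what makes the module invariant to object ordering.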
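
The Experiment Setup row pins down most hyperparameters for the CLEVR-from-pixels model, so that configuration can also be sketched. Kernel size, stride, vocabulary size, and the gθ input width (which depends on the coordinate tagging and question concatenation the paper describes) are assumptions here; the layer counts, unit counts, dropout rate, learning rate, and batch size follow the row above.

    import torch
    import torch.nn as nn

    # Sizes from the Experiment Setup row; kernel size, stride, and
    # vocabulary size are assumptions.
    layers, in_ch = [], 3
    for _ in range(4):  # 4 convolutional layers, 24 kernels each
        layers += [nn.Conv2d(in_ch, 24, kernel_size=3, stride=2, padding=1),
                   nn.BatchNorm2d(24), nn.ReLU()]
        in_ch = 24
    cnn = nn.Sequential(*layers)

    embed = nn.Embedding(90, 32)               # 32-unit embeddings (vocab size assumed)
    lstm = nn.LSTM(32, 128, batch_first=True)  # 128-unit question LSTM

    # g_theta: four 256-unit layers. The 180-d input assumes two 24-d
    # feature-map "objects" tagged with 2-d coordinates each, plus the
    # 128-d question embedding: 2 * (24 + 2) + 128 = 180.
    g_theta = nn.Sequential(
        nn.Linear(180, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU())

    # f_phi: 256, 256 (with 50% dropout), then 29 answer logits from a
    # final linear layer, as the row states.
    f_phi = nn.Sequential(
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(), nn.Dropout(0.5),
        nn.Linear(256, 29))

    params = [p for m in (cnn, embed, lstm, g_theta, f_phi)
              for p in m.parameters()]
    optimizer = torch.optim.Adam(params, lr=2.5e-4)  # learning rate as stated
    loss_fn = nn.CrossEntropyLoss()                  # softmax cross-entropy
    BATCH_SIZE = 64                                  # mini-batch size as stated
    # The paper also used 10 workers synchronously updating a central
    # parameter server; that distributed setup is not reproduced here.

Pair construction and question concatenation would follow the pattern in the first sketch; the 29-way output matches the answer-vocabulary size given in the row.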