Compositional Attention Networks for Machine Reasoning

Authors: Drew A. Hudson, Christopher D. Manning

ICLR 2018

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate our model on the recent CLEVR dataset (Johnson et al., 2016). CLEVR is a synthetic dataset consisting of 700K tuples; each consists of a 3D-rendered image featuring objects of various shapes, colors, materials and sizes, coupled with compositional multi-step questions that measure performance on an array of challenging reasoning skills such as following transitive relations, counting objects and comparing their properties.
Researcher Affiliation Academia Drew A. Hudson Department of Computer Science Stanford University dorarad@cs.stanford.edu Christopher D. Manning Department of Computer Science Stanford University manning@cs.stanford.edu
Pseudocode No The paper provides detailed descriptions of the model's architecture and mathematical formulations in equations, but it does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code No A TensorFlow implementation of the network, along with pretrained models, will be made publicly available.
Open Datasets Yes We evaluate our model on the recent CLEVR dataset (Johnson et al., 2016).
Dataset Splits Yes We evaluate our model on the recent CLEVR dataset (Johnson et al., 2016).
Hardware Specification Yes The training process takes roughly 10-20 hours on a single Titan X GPU.
Software Dependencies No The paper mentions using TensorFlow for the implementation and GloVe for word embeddings, but it does not provide specific version numbers for these or other software libraries used in the experiments.
Experiment Setup Yes We use MAC network with p = 12 cells, and train it using Adam (Kingma & Ba, 2014), with learning rate 10^-4. We train our model for 10-20 epochs, with batch size 64, and use early stopping based on validation accuracies. During training, the moving averages of all weights of the model are maintained with the exponential decay rate of 0.999. At test time, the moving averages instead of the raw weights are used. We use dropout 0.85, and ELU (Clevert et al., 2015)...
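The weight-averaging scheme quoted above (maintain an exponential moving average of all weights with decay 0.999 during training, then evaluate with the averaged weights) can be sketched as follows. This is a minimal illustrative sketch, not the authors' TensorFlow code; the function name and dict-of-weights representation are assumptions.

```python
DECAY = 0.999  # exponential decay rate reported in the paper

def update_ema(shadow, weights, decay=DECAY):
    """One EMA step: shadow <- decay * shadow + (1 - decay) * weight.

    `shadow` holds the moving averages; `weights` holds the raw model
    weights after the latest optimizer step. Returns the updated shadow.
    """
    return {name: decay * shadow[name] + (1 - decay) * w
            for name, w in weights.items()}

# Usage: initialize the shadow copy from the initial weights, then
# update it after every training step; at test time, load the shadow
# values into the model instead of the raw weights.
weights = {"w1": 1.0}
shadow = dict(weights)
weights["w1"] = 2.0                 # pretend an optimizer step changed w1
shadow = update_ema(shadow, weights)
# shadow["w1"] is now 0.999 * 1.0 + 0.001 * 2.0 = 1.001
```

Because the decay is close to 1, the averaged weights change slowly, which smooths out step-to-step noise in the raw weights at evaluation time.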