Attention over Learned Object Embeddings Enables Complex Visual Reasoning

Authors: David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We tested Aloe on three datasets, CLEVRER [41], CATER [12], and ACRE [44]. For each dataset, we pretrained a MONet model on individual frames. More training details and a table of hyperparameters are given in Appendix A.3; these hyperparameters were obtained through a hyperparameter sweep. All error bars are standard deviations computed over at least 5 random seeds." |
| Researcher Affiliation | Industry | David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick; DeepMind, London, United Kingdom. {fding, felixhill, adamsantoro, mareynolds, botvinick}@google.com |
| Pseudocode | Yes | "In pseudo-code, global attention can be expressed as `out = transformer(reshape(objects, [B, F * N, D]))` and hierarchical attention as `out = transformer1(reshape(objects, [B * F, N, D])); out = transformer2(reshape(out, [B, F, N * D]))`." |
| Open Source Code | Yes | Model code: https://github.com/deepmind/deepmind-research/tree/master/object_attention_for_reasoning |
| Open Datasets | Yes | "We tested Aloe on three datasets, CLEVRER [41], CATER [12], and ACRE [44]." |
| Dataset Splits | No | The paper states "More training details and a table of hyperparameters are given in Appendix A.3; these hyperparameters were obtained through a hyperparameter sweep." and mentions training on "N% of the videos and their associated labeled data", but does not explicitly provide specific train/validation/test split percentages or counts in the main body. |
| Hardware Specification | No | The paper does not provide specific hardware details, such as GPU or CPU models or memory specifications, used to run the experiments. |
| Software Dependencies | No | The paper does not explicitly list the software dependencies (e.g., library names with version numbers) needed to replicate the experiments. |
| Experiment Setup | Yes | "More training details and a table of hyperparameters are given in Appendix A.3; these hyperparameters were obtained through a hyperparameter sweep." |
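The pseudocode row above contrasts two ways of applying attention over per-frame object embeddings: global attention over all F × N object slots at once, versus hierarchical attention within each frame followed by attention across frames. The sketch below illustrates only the reshaping schemes, using a minimal numpy stand-in for `transformer` (a single-head softmax self-attention; this is an assumption for illustration, not the paper's actual architecture):

```python
import numpy as np

def transformer(x):
    # Hypothetical stand-in for a real Transformer stack: one head of
    # scaled dot-product self-attention over the sequence axis, no
    # learned projections, residuals, or feed-forward layers.
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

B, F, N, D = 2, 4, 3, 8  # batch, frames, objects per frame, embedding dim
objects = np.random.default_rng(0).normal(size=(B, F, N, D))

# Global attention: every object slot in every frame attends to all
# F * N slots in the video.
global_out = transformer(objects.reshape(B, F * N, D))

# Hierarchical attention: first attend among the N objects within each
# frame, then attend across the F frames (each frame summarized as one
# N * D vector).
per_frame = transformer(objects.reshape(B * F, N, D))
across_frames = transformer(per_frame.reshape(B, F, N * D))

print(global_out.shape)    # (B, F * N, D)
print(across_frames.shape) # (B, F, N * D)
```

The key difference is the sequence length the attention sees: F * N slots for the global variant versus N (then F) for the hierarchical one, which changes the quadratic attention cost accordingly.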