Learning to Make Analogies by Contrasting Abstract Relational Structure
Authors: Felix Hill, Adam Santoro, David Barrett, Ari Morcos, Timothy Lillicrap
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Two supporting excerpts: 'Our first experiments involve greyscale visual scenes similar to those previously applied to test both human reasoning ability (Raven, 1983; Geary et al., 2000) and reasoning in machine learning models (Bongard, 1967; Fleuret et al., 2011; Barrett et al., 2018).' and 'Table 1: Test performance of a selection of visual reasoning network architectures trained in the normal regime and with LABC on the Novel Domain Transfer experiment.' |
| Researcher Affiliation | Industry | DeepMind, London {felixhill,adamsantoro,barrettdavid,arimorcos,countzero}@google.com |
| Pseudocode | No | The paper describes model architectures and training procedures in text, such as 'Our model consisted of a simple perceptual front-end a convolutional neural network (CNN) which provided input for a recurrent neural network (RNN)', but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a link for the dataset ('The visual analogy dataset can be downloaded from https://github.com/deepmind/abstract-reasoning-matrices') but does not explicitly state that the source code for the models or methods described in the paper is available. |
| Open Datasets | Yes | The visual analogy dataset can be downloaded from https://github.com/deepmind/abstract-reasoning-matrices |
| Dataset Splits | Yes | Note that in all experiments reported below, we generated 600,000 training questions, 10,000 validation questions and test sets of 100,000 questions. |
| Hardware Specification | No | The paper describes computational models and training parameters but does not provide specific details on the hardware used to run the experiments, such as GPU models or processor types. |
| Software Dependencies | No | The paper mentions optimizers and loss functions, such as 'Adam optimizer' and 'cross entropy loss function', but does not specify programming languages, libraries, or their version numbers that would be required to reproduce the experiments. |
| Experiment Setup | Yes | 'The CNN was 4-layers deep, with 32 kernels per layer, each of size 3x3 with a stride of 2... The RNN processed the source sequence embeddings... with 64 hidden units... We used a cross entropy loss function and the Adam optimizer with a learning rate of 1e-4.' A separate configuration states: 'We used batch sizes of 32 and the Adam optimizer with a learning rate of 3e-4.' A hedged code sketch of this setup follows the table. |
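The quoted setup is concrete enough to sketch in code. The following is a minimal, hypothetical PyTorch reconstruction, not the authors' implementation: the layer count, kernel count and size, stride, RNN hidden size, cross entropy loss, Adam optimizer, and 1e-4 learning rate come from the quotes above, while the panel size (80x80 greyscale), sequence length, number of candidate answers, LSTM cell choice, and linear scoring head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AnalogyNet(nn.Module):
    """CNN front-end feeding an RNN, per the quoted setup.

    Sketch only: panel size, LSTM cell, and scoring head are assumptions,
    not details taken from the paper.
    """

    def __init__(self, panel_size: int = 80, hidden_size: int = 64):
        super().__init__()
        # "4-layers deep, with 32 kernels per layer, each of size 3x3
        #  with a stride of 2" (quoted from the paper)
        convs, ch = [], 1
        for _ in range(4):
            convs += [nn.Conv2d(ch, 32, kernel_size=3, stride=2, padding=1),
                      nn.ReLU()]
            ch = 32
        self.cnn = nn.Sequential(*convs)
        feat = 32 * (panel_size // 16) ** 2  # four stride-2 layers halve H and W each time
        # "The RNN processed the source sequence embeddings ... with 64 hidden units"
        self.rnn = nn.LSTM(feat, hidden_size, batch_first=True)
        self.score = nn.Linear(hidden_size, 1)  # assumed per-sequence scoring head

    def forward(self, panels: torch.Tensor) -> torch.Tensor:
        # panels: (batch, seq_len, 1, H, W) -> one scalar score per sequence
        b, t = panels.shape[:2]
        emb = self.cnn(panels.flatten(0, 1)).flatten(1).view(b, t, -1)
        out, _ = self.rnn(emb)
        return self.score(out[:, -1]).squeeze(-1)

# Training step with the quoted loss and optimizer (cross entropy, Adam, lr 1e-4).
model = AnalogyNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch: 32 questions, 4 candidate completions each, 6-panel sequences
# (all shapes here are assumptions for illustration).
batch, n_cands = 32, 4
panels = torch.randn(batch * n_cands, 6, 1, 80, 80)
labels = torch.randint(0, n_cands, (batch,))

logits = model(panels).view(batch, n_cands)  # one score per candidate answer
loss = loss_fn(logits, labels)
opt.zero_grad()
loss.backward()
opt.step()
```

For the other quoted configuration, the batch size of 32 is already used above and the 3e-4 learning rate would simply swap in at the `torch.optim.Adam` line.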