Learning Representations that Support Extrapolation
Authors: Taylor Webb, Zachary Dulberg, Steven Frankland, Alexander Petrov, Randall O’Reilly, Jonathan Cohen
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate TCN, in addition to a number of competitive alternative techniques, on the VAEC dataset, the visual analogy dataset from Hill et al. (2019), and a dynamic object prediction task. We find that TCN yields a considerable improvement in the ability to extrapolate in each of these task domains. |
| Researcher Affiliation | Academia | 1Department of Psychology, University of California Los Angeles, Los Angeles, CA 2Princeton Neuroscience Institute, Princeton, NJ 3Department of Psychology, The Ohio State University, Columbus, OH 4Department of Psychology, University of California Davis, Davis, CA. |
| Pseudocode | No | The paper describes the proposed method (TCN) and models using textual descriptions and mathematical equations, but does not include any structured pseudocode or algorithm blocks. A hedged sketch of what TCN computes is given after this table. |
| Open Source Code | No | The paper does not contain any explicit statement about making the source code for the described methodology publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We also evaluated TCN on the extrapolation regime from the visual analogy dataset in Hill et al. (2019). |
| Dataset Splits | No | The paper specifies training and testing regions/scales for the datasets and mentions training iterations and batch sizes, but does not explicitly provide details about training/validation/test dataset splits or a specific validation set. |
| Hardware Specification | No | The paper mentions the use of software frameworks like TensorFlow and PyTorch for simulations, but does not provide any specific details regarding the hardware (e.g., GPU models, CPU types, or cloud infrastructure) used for running the experiments. |
| Software Dependencies | No | The paper mentions using TensorFlow and PyTorch for simulations, and the ADAM optimizer, but it does not specify the version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Each network was trained for 10,000 iterations, with a batch size of 32, using the ADAM optimizer (Kingma & Ba, 2014) with a learning rate of 5e-4 (except as otherwise noted in Section 4.2). All weights were initialized using Xavier uniform initialization (Glorot & Bengio, 2010), and all biases were initialized to zero. A minimal training-loop sketch reflecting this setup follows the table. |
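
Because the paper presents TCN (temporal context normalization) only through prose and equations, the following is a minimal sketch of what such a normalization layer might look like. It assumes TCN z-scores each feature dimension over the items of a task context (the temporal dimension), with a learned per-dimension gain and shift; the module name, tensor shapes, and epsilon value are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TemporalContextNorm(nn.Module):
    """Sketch of temporal context normalization (TCN).

    Assumption: TCN z-scores each feature dimension over the temporal
    (context) dimension of a sequence, with a learned per-dimension gain
    and shift -- analogous to batch norm, but normalizing over the items
    of a single task context rather than over the batch.
    """

    def __init__(self, num_features: int, eps: float = 1e-8):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(num_features))  # learned gain
        self.beta = nn.Parameter(torch.zeros(num_features))  # learned shift

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, sequence_length, num_features)
        mu = z.mean(dim=1, keepdim=True)                  # mean over the context window
        var = z.var(dim=1, keepdim=True, unbiased=False)  # variance over the context window
        z_norm = (z - mu) / torch.sqrt(var + self.eps)
        return self.gamma * z_norm + self.beta

# Example: normalize embeddings of a 6-item analogy problem (shapes are illustrative).
tcn = TemporalContextNorm(num_features=256)
embeddings = torch.randn(32, 6, 256)  # (batch, items per problem, embedding dim)
normalized = tcn(embeddings)
```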
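
The experiment-setup row pins down the optimizer, learning rate, batch size, iteration count, and initialization scheme, but not a full training script. The sketch below shows how those reported hyperparameters might be wired together in PyTorch (one of the frameworks the paper mentions); the model architecture, loss function, and `sample_batch` helper are hypothetical placeholders, not the paper's networks or datasets.

```python
import torch
import torch.nn as nn

def init_weights(module: nn.Module) -> None:
    # Xavier uniform initialization for weights, zeros for biases,
    # as stated in the paper's experiment setup.
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Placeholder model and loss; the paper's actual architectures differ.
model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 4))
model.apply(init_weights)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # ADAM, learning rate 5e-4

def sample_batch(batch_size: int = 32):
    # Placeholder batch: random features and labels, not the actual task data.
    x = torch.randn(batch_size, 256)
    y = torch.randint(0, 4, (batch_size,))
    return x, y

for iteration in range(10_000):          # 10,000 training iterations
    x, y = sample_batch(batch_size=32)   # batch size 32
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```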