Learning Representations that Support Extrapolation

Authors: Taylor Webb, Zachary Dulberg, Steven Frankland, Alexander Petrov, Randall O’Reilly, Jonathan Cohen

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate TCN, in addition to a number of competitive alternative techniques, on the VAEC dataset, the visual analogy dataset from Hill et al. (2019), and a dynamic object prediction task. We find that TCN yields a considerable improvement in the ability to extrapolate in each of these task domains.
Researcher Affiliation | Academia | 1 Department of Psychology, University of California Los Angeles, Los Angeles, CA; 2 Princeton Neuroscience Institute, Princeton, NJ; 3 Department of Psychology, The Ohio State University, Columbus, OH; 4 Department of Psychology, University of California Davis, Davis, CA.
Pseudocode | No | The paper describes the proposed method (TCN) and its models using textual descriptions and mathematical equations, but does not include any structured pseudocode or algorithm blocks (a hedged implementation sketch of the normalization step appears after this table).
Open Source Code | No | The paper does not explicitly state that the source code for the described methodology is publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | We also evaluated TCN on the extrapolation regime from the visual analogy dataset in Hill et al. (2019).
Dataset Splits | No | The paper specifies training and testing regions/scales for the datasets and mentions training iterations and batch sizes, but does not explicitly describe training/validation/test splits or a separate validation set.
Hardware Specification | No | The paper mentions software frameworks such as TensorFlow and PyTorch for its simulations, but provides no details about the hardware (e.g., GPU models, CPU types, or cloud infrastructure) used to run the experiments.
Software Dependencies | No | The paper mentions using TensorFlow and PyTorch for simulations, and the ADAM optimizer, but it does not specify the version numbers for these or any other software dependencies.
Experiment Setup | Yes | Each network was trained for 10,000 iterations, with a batch size of 32, using the ADAM optimizer (Kingma & Ba, 2014) with a learning rate of 5e-4 (except as otherwise noted in Section 4.2). All weights were initialized using Xavier uniform initialization (Glorot & Bengio, 2010), and all biases were initialized to zero. (A training-setup sketch based on these details appears after this table.)
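
As noted in the Pseudocode row, the paper specifies TCN only in prose and equations. Below is a minimal PyTorch sketch of the core idea as described there: each feature dimension of a sequence of embeddings is z-scored over the temporal (task-context) dimension, followed by a learned per-dimension scale and shift. The module name `ContextNorm`, the tensor layout `(batch, time, dim)`, and the epsilon value are illustrative assumptions, not details taken from the authors' code.

```python
import torch
import torch.nn as nn


class ContextNorm(nn.Module):
    """Sketch of temporal context normalization (TCN).

    Z-scores each embedding dimension over the temporal/context dimension of a
    task instance, then applies a learned per-dimension scale and shift.
    Shapes and defaults here are assumptions, not taken from the paper.
    """

    def __init__(self, dim, eps=1e-8):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(dim))  # learned scale
        self.beta = nn.Parameter(torch.zeros(dim))  # learned shift
        self.eps = eps

    def forward(self, z):
        # z: (batch, time, dim) -- embeddings of the items in one task instance
        mu = z.mean(dim=1, keepdim=True)
        var = z.var(dim=1, unbiased=False, keepdim=True)
        z_hat = (z - mu) / torch.sqrt(var + self.eps)
        return self.gamma * z_hat + self.beta


# Example: four analogy-item embeddings of size 128 per task instance.
z = torch.randn(32, 4, 128)
print(ContextNorm(128)(z).shape)  # torch.Size([32, 4, 128])
```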
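
The Experiment Setup details quoted above translate directly into optimizer and initialization code. A minimal sketch, assuming a generic PyTorch model, a placeholder `get_batch` function, and a cross-entropy loss (none of which come from the paper):

```python
import torch
import torch.nn as nn


def init_weights(module):
    # Xavier uniform initialization for weights, zeros for biases
    # (Glorot & Bengio, 2010), as reported in the paper.
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)


def train(model, get_batch, num_iterations=10_000, batch_size=32, lr=5e-4):
    # ADAM optimizer with learning rate 5e-4, 10,000 iterations, batch size 32.
    model.apply(init_weights)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()  # placeholder: the actual loss is task-dependent
    for step in range(num_iterations):
        x, y = get_batch(batch_size)  # placeholder data-loading function
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    return model
```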