Modular Continual Learning in a Unified Visual Environment

Authors: Kevin T. Feigelis, Blue Sheffer, Daniel L. K. Yamins

ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In Section 3 we describe and evaluate comparative performance of multiple ReMaP module architectures on a variety of TouchStream tasks. We compared each architecture across 12 variants of visual SR, MTS, and localization tasks, using fixed visual encoding features from layer FC6 of VGG-16." (A sketch of FC6 feature extraction appears below the table.)
Researcher Affiliation | Academia | Kevin T. Feigelis, Department of Physics, Stanford Neurosciences Institute, Stanford University, Stanford, CA 94305, feigelis@stanford.edu; Blue Sheffer, Department of Psychology, Stanford University, Stanford, CA 94305, bsheffer@stanford.edu; Daniel L. K. Yamins, Departments of Psychology and Computer Science, Stanford Neurosciences Institute, Stanford University, Stanford, CA 94305, yamins@stanford.edu
Pseudocode | Yes | Algorithm 1: ReMaP (Reward Map Prediction). (A loose illustrative sketch of the reward-map idea appears below the table.)
Open Source Code | No | The paper neither states that source code will be released nor links to a code repository for the described methodology.
Open Datasets | Yes | "Although we work with modern large-scale computer vision-style datasets and tasks in this work, e.g. ImageNet (Deng et al. (2009)) and MS-COCO (Lin et al. (2014))..."
Dataset Splits | No | "Each class has 1300 unique training instances, and 50 unique validation instances." The paper mentions that learning rates were tuned in a "cross-validated fashion" but does not provide overall training/validation/test splits for reproducibility beyond the number of validation instances for one dataset.
Hardware Specification | No | The paper does not specify any particular hardware components (e.g., GPU model, CPU type, memory) used for conducting the experiments.
Software Dependencies | No | The paper mentions algorithms (e.g., ADAM), network architectures (e.g., VGG-16), and activation functions (e.g., ReLU, CReLU) but does not specify versions of any key software components or libraries used in the implementation.
Experiment Setup | Yes | "Module weights were initialized using a normal distribution with µ = 0.0, σ = 0.01, and optimized using the ADAM algorithm (Kingma & Ba (2014)) with parameters β1 = 0.9, β2 = 0.999 and ϵ = 1e-8. Learning rates were optimized on a per-task, per-architecture basis in a cross-validated fashion. Values used in the present study may be seen in Table S2." (A sketch of this optimizer setup appears below the table.)
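
The Research Type row above quotes the paper's use of fixed visual encoding features from layer FC6 of VGG-16. Below is a minimal sketch of how such frozen FC6 features could be extracted, assuming PyTorch/torchvision (the paper does not name a framework, so this library choice and the preprocessing placeholder are assumptions):

```python
import torch
import torchvision.models as models

# Minimal sketch, assuming PyTorch/torchvision (the paper names no framework):
# VGG-16 frozen up through layer FC6, used as a fixed visual encoder.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
vgg.eval()  # encoder weights stay fixed; only the downstream modules are trained

# In torchvision's layout, FC6 is classifier[0] (Linear 25088 -> 4096),
# followed by its ReLU at classifier[1].
fc6_encoder = torch.nn.Sequential(
    vgg.features,
    vgg.avgpool,
    torch.nn.Flatten(),
    vgg.classifier[0],
    vgg.classifier[1],
)

with torch.no_grad():
    batch = torch.randn(1, 3, 224, 224)  # placeholder for a preprocessed frame
    features = fc6_encoder(batch)        # shape: (1, 4096)
```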
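The Pseudocode row refers to Algorithm 1 (ReMaP, Reward Map Prediction), whose authoritative listing is in the paper. Purely as a hypothetical illustration of the underlying idea, a module that scores candidate touch actions by predicted reward and samples a touch from the normalized map, one might sketch the following; all names, layer sizes, and the softmax normalization are assumptions, not the paper's specification:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the reward-map idea only; Algorithm 1 in the paper
# is the authoritative specification. Names and sizes here are assumptions.
class RewardMapModule(nn.Module):
    def __init__(self, feature_dim=4096, action_dim=2, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, features, actions):
        # features: (1, feature_dim); actions: (n, action_dim) candidate touches
        f = features.expand(actions.shape[0], -1)
        return self.net(torch.cat([f, actions], dim=1)).squeeze(-1)  # (n,) rewards

def sample_action(module, features, n_candidates=1024):
    # Score candidate touch locations and draw one in proportion to the
    # (softmax-normalized) predicted reward map.
    actions = torch.rand(n_candidates, 2)  # touches in the unit square
    with torch.no_grad():
        rewards = module(features, actions)
    probs = torch.softmax(rewards, dim=0)
    idx = torch.multinomial(probs, 1).item()
    return actions[idx]

# Usage with stand-in FC6 features:
encoder_features = torch.randn(1, 4096)
module = RewardMapModule()
touch = sample_action(module, encoder_features)  # (x, y) in [0, 1)^2
```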
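The Experiment Setup row gives concrete initialization and optimizer hyperparameters. A minimal sketch of that configuration, with a placeholder module and a stand-in learning rate (the paper tunes learning rates per task and per architecture; see its Table S2):

```python
import torch

# Sketch of the stated setup: weights ~ N(0.0, 0.01); ADAM with
# beta1=0.9, beta2=0.999, eps=1e-8 (Kingma & Ba, 2014).
module = torch.nn.Linear(4096, 1)  # placeholder; the paper compares several architectures
for p in module.parameters():
    torch.nn.init.normal_(p, mean=0.0, std=0.01)

# Learning rates were cross-validated per task and architecture (Table S2);
# 1e-4 below is only a stand-in, not a value from the paper.
optimizer = torch.optim.Adam(module.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), eps=1e-8)
```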