Been There, Done That: Meta-Learning with Episodic Recall
Authors: Samuel Ritter, Jane Wang, Zeb Kurth-Nelson, Siddhant Jayakumar, Charles Blundell, Razvan Pascanu, Matthew Botvinick
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested the capabilities of L2RL agents equipped with epLSTM (epL2RL agents) in five experiments. Experiments 1-3 use multi-armed bandits, first exploring the basic case where tasks reoccur in their entirety and are identified by exactly reoccurring contexts (Exp. 1), then the more difficult challenge wherein contexts are drawn from Omniglot categories and vary in appearance with each reoccurrence (Exp. 2), and then the more complex scenario where task components reoccur in arbitrary combinations (Exp. 3). Experiment 4 uses a water maze navigation task to assess epL2RL's ability to handle multi-state MDPs, and Experiment 5 uses a task from the neuroscience literature to examine the learning algorithms epL2RL learns to execute. |
| Researcher Affiliation | Collaboration | 1DeepMind, London, UK; 2Princeton Neuroscience Institute, Princeton, NJ; 3MPS-UCL Centre for Computational Psychiatry, London, UK; 4Gatsby Computational Neuroscience Unit, UCL, London, UK. |
| Pseudocode | No | The paper describes the model architecture in text and through a diagram (Figure 2), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The acknowledgements section mentions the use of an asynchronous RL codebase and a DND library, but there is no explicit statement or link provided indicating that the source code for the methodology described in this paper is publicly available. |
| Open Datasets | Yes | We used pretrained Omniglot embeddings from Kaiser et al. (2017). This is a particularly appropriate method for pretraining because such a contrastive loss optimization procedure (Hadsell et al., 2006) could be run online over the DND's contents, assuming some heuristic for determining neighbor status. |
| Dataset Splits | No | The paper describes training and evaluation episodes, and mentions that weights were frozen during evaluation, but it does not specify explicit dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of 'Tensorflow and Torch predecessors' and a 'DND library' in the acknowledgements, but it does not provide specific version numbers for these or any other software dependencies, making replication challenging. |
| Experiment Setup | No | The paper states that 'Hyperparameters were tuned for the basic L2RL model, and held fixed for the other model variations,' but it does not provide specific values for these hyperparameters (e.g., learning rate, batch size, optimizer settings) in the main text. |
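Since the paper provides no pseudocode or public source, the episodic mechanism it describes (an LSTM whose cell states are stored in a differentiable neural dictionary, keyed by context embeddings, and gated back in on recall) can be illustrated with a minimal sketch. This is an illustrative approximation, not the authors' implementation: the `DND` class, the inverse-distance kernel, and the fixed scalar `gate` in `reinstate` are all simplifying assumptions (in the paper the reinstatement gate is learned).

```python
import numpy as np

class DND:
    """Sketch of a differentiable-neural-dictionary-style episodic store
    (an assumption, not DeepMind's DND library): keys are context
    embeddings, values are LSTM cell states; reads return an
    inverse-distance weighted average of the k nearest stored values."""

    def __init__(self, k=2):
        self.k = k
        self.keys, self.values = [], []

    def write(self, key, value):
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(np.asarray(value, dtype=float))

    def read(self, query):
        query = np.asarray(query, dtype=float)
        dists = np.array([np.linalg.norm(query - k) for k in self.keys])
        idx = np.argsort(dists)[: self.k]
        w = 1.0 / (dists[idx] + 1e-3)  # inverse-distance kernel weights
        w /= w.sum()
        return sum(wi * self.values[i] for wi, i in zip(w, idx))

def reinstate(c_current, c_retrieved, gate):
    """Blend the current cell state with the retrieved episodic one.
    `gate` in [0, 1] stands in for the paper's learned reinstatement gate."""
    return gate * c_retrieved + (1.0 - gate) * c_current
```

With a k of 1, a query near a stored context simply reinstates the cell state saved when that context last occurred, which is the behavior the bandit experiments rely on when tasks reoccur with exactly matching contexts.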