Tell me why! Explanations support learning relational and causal structure

Authors: Andrew K. Lampinen, Nicholas Roy, Ishita Dasgupta, Stephanie C. Y. Chan, Allison Tam, James McClelland, Chen Yan, Adam Santoro, Neil C. Rabinowitz, Jane Wang, Felix Hill

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational and causal knowledge, augmenting their experience by training them to predict language descriptions and explanations can overcome these limitations. We show that language can help agents learn challenging relational tasks, and examine which aspects of language contribute to its benefits. We then show that explanations can help agents to infer not only relational but also causal structure.
Researcher Affiliation | Industry | DeepMind, London, UK. Correspondence to: Andrew Lampinen <lampinen@deepmind.com>.
Pseudocode | No | The paper describes the agent architecture and training process in text and diagrams, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | We are in the process of preparing our 2D environments for release; once this process is complete they will be released at https://github.com/deepmind/tell_me_why_explanations_rl
Open Datasets | No | The paper describes the creation of custom 2D and 3D RL environments for the tasks: "We instantiate these tasks in 2D and 3D RL environments (Fig. 3a)". It does not use or provide access information for a publicly available, pre-existing dataset.
Dataset Splits | No | The paper describes training and evaluation setups for its RL agents, including a "training and testing setup" and a "meta-learning setting where agents complete episodes composed of four odd-one-out trials". However, it does not provide specific percentages or counts for traditional training, validation, or test dataset splits in the manner of supervised learning.
Hardware Specification | Yes | All agents were implemented using JAX (Bradbury et al., 2018) and Haiku (Hennigan et al., 2020), and were trained using TPU v3 and v4 devices.
Software Dependencies | Yes | All agents were implemented using JAX (Bradbury et al., 2018) and Haiku (Hennigan et al., 2020)
Experiment Setup | Yes | In Table 2 we list the architectural and hyperparameters used for the main experiments.
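The Software Dependencies and Experiment Setup rows state that the agents were built with JAX and Haiku, but the report itself contains no code. The sketch below is a minimal, hypothetical illustration of what a Haiku-transformed network with a JAX/optax training setup can look like; it is not the paper's agent architecture, and the module sizes, action count, observation shape, and optimiser setting are placeholder assumptions.

```python
# Hypothetical sketch only: illustrates the JAX/Haiku/optax pattern named in the
# dependencies row, not the paper's actual agent. All sizes are placeholders.
import jax
import jax.numpy as jnp
import haiku as hk
import optax


def agent_core_fn(obs):
    """Tiny policy/value torso: MLP encoder -> policy logits and value estimate."""
    torso = hk.nets.MLP([256, 256], activation=jax.nn.relu)
    h = torso(obs)
    policy_logits = hk.Linear(8)(h)          # 8 = placeholder action count
    value = hk.Linear(1)(h)
    return policy_logits, jnp.squeeze(value, axis=-1)


# hk.transform converts the function into a pure (init, apply) pair, as Haiku requires.
agent_core = hk.without_apply_rng(hk.transform(agent_core_fn))

rng = jax.random.PRNGKey(0)
dummy_obs = jnp.zeros((1, 64))               # placeholder observation shape
params = agent_core.init(rng, dummy_obs)     # initialise network parameters
logits, value = agent_core.apply(params, dummy_obs)

# Parameter updates would typically go through optax; shown here only to indicate
# where an optimiser fits in this pattern.
optimiser = optax.adam(1e-4)
opt_state = optimiser.init(params)
```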