Semantic Exploration from Language Abstractions and Pretrained Representations

Authors: Allison Tam, Neil Rabinowitz, Andrew Lampinen, Nicholas A. Roy, Stephanie Chan, DJ Strouse, Jane Wang, Andrea Banino, Felix Hill

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that these pretrained representations drive meaningful, task-relevant exploration and improve performance on 3D simulated environments. We also characterize why and how language provides useful abstractions for exploration by considering the impacts of using representations from a pretrained model, a language oracle, and several ablations. ... Our results show that language-based exploration with pretrained vision-language representations improves sample efficiency on Playroom tasks by 18-70%. It also doubles the visited areas in City, compared to baseline methods." (See the exploration-bonus sketch after the table.)
Researcher Affiliation | Industry | Allison C. Tam, DeepMind, London, UK (actam@deepmind.com); Neil C. Rabinowitz, DeepMind, London, UK (ncr@deepmind.com); Andrew K. Lampinen, DeepMind, London, UK (lampinen@deepmind.com); Nicholas A. Roy, DeepMind, London, UK (nroy@deepmind.com); Stephanie C. Y. Chan, DeepMind, London, UK (scychan@deepmind.com); DJ Strouse, DeepMind, London, UK (strouse@deepmind.com); Jane X. Wang, DeepMind, London, UK (wangjane@deepmind.com); Andrea Banino, DeepMind, London, UK (abanino@deepmind.com); Felix Hill, DeepMind, London, UK (felixhill@deepmind.com)
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper does not include an unambiguous statement of, or a direct link to, the release of its own source code. It refers to existing models and environments, but not to the code for its specific methodology.
Open Datasets | Yes | "In this paper, we focus on first-person Unity-based 3D environments that are meant to mimic familiar scenes from the real world (Figure 2). Playroom: Our first domain, Playroom [1, 52], is a randomly-generated house containing everyday household items... Our second domain, City, is an expansive, large-scale urban environment."
Dataset Splits | No | The paper describes the Playroom and City environments but provides no training, validation, or test split details (e.g., percentages or sample counts). It states that "Hyperparameters and additional details are found in Appendix A.", but that appendix is not included in the snippet.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used to run its experiments. It mentions Unity-based 3D environments and various RL algorithms, but not the underlying hardware.
Software Dependencies | No | The paper names specific algorithms and models (e.g., IMPALA, R2D2, BERT, CLIP) and describes components such as a ResNet and an LSTM, but it provides no version numbers for any software dependencies, libraries, or frameworks used in the implementation.
Experiment Setup | Yes | "For both environments, the agent architecture consists of an image ResNet encoder and a language LSTM encoder that feed into a memory LSTM module. The policy and value heads are MLPs that receive the memory state as input. If the exploration method requires additional networks, such as the trainable network in RND or the inverse dynamics model in NGU, they do not share any parameters with the policy or value networks. Figure S2 is a visualization of an Impala agent that uses language-augmented exploration. Hyperparameters and additional details are found in Appendix A." (See the architecture sketch after the table.)
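
The method quoted under Research Type measures novelty in the representation space of a pretrained vision-language model rather than over raw pixels. Below is a minimal Python sketch of one plausible form of such an episodic novelty bonus. The k-nearest-neighbor bonus is an assumption in the spirit of NGU-style episodic novelty, not the paper's exact formulation, and `embed_image` is a hypothetical wrapper around a pretrained encoder (e.g., CLIP's image tower).

```python
import numpy as np

def episodic_novelty_bonus(embedding, episode_memory, k=10):
    """Toy episodic novelty bonus: mean distance from the current
    embedding to its k nearest neighbors among embeddings already
    seen this episode. Larger distance => more novel observation.

    embedding: 1-D array, e.g. a pretrained vision-language embedding.
    episode_memory: list of embeddings from earlier in the episode.
    """
    if not episode_memory:
        return 1.0  # first observation of the episode: maximally novel
    dists = np.linalg.norm(np.stack(episode_memory) - embedding, axis=1)
    return float(np.sort(dists)[:k].mean())

# Hypothetical per-step usage inside an RL loop:
#   z = embed_image(observation)            # pretrained encoder (assumed)
#   r_int = episodic_novelty_bonus(z, memory)
#   memory.append(z)
#   reward = r_ext + beta * r_int           # beta scales the bonus
```

Because the bonus is computed in a semantically structured embedding space, two viewpoints of the same object score as similar while genuinely new objects score as novel, which is the intuition behind the "task-relevant exploration" claim quoted above.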
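The Experiment Setup row describes the agent architecture directly. The following PyTorch sketch mirrors that description under stated assumptions: the small convolutional stack stands in for the paper's ResNet encoder, and the layer sizes, vocabulary size, and action count are illustrative placeholders rather than the paper's hyperparameters (which its Appendix A specifies).

```python
import torch
import torch.nn as nn

class LanguageAugmentedAgent(nn.Module):
    """Sketch of the described agent: an image encoder and a language
    LSTM encoder feed a memory LSTM, whose state drives MLP policy and
    value heads. All sizes are illustrative assumptions.
    """

    def __init__(self, vocab_size=1000, embed_dim=64,
                 hidden_dim=256, num_actions=46):
        super().__init__()
        # Stand-in for the paper's ResNet image encoder.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(hidden_dim),
        )
        # Language LSTM encoder over instruction tokens.
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.language_encoder = nn.LSTM(embed_dim, hidden_dim,
                                        batch_first=True)
        # Memory LSTM over the fused image + language features.
        self.memory = nn.LSTMCell(2 * hidden_dim, hidden_dim)
        # MLP policy and value heads that read the memory state.
        self.policy_head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_actions))
        self.value_head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1))

    def forward(self, image, instruction_tokens, memory_state=None):
        img = self.image_encoder(image)                       # (B, H)
        _, (lang, _) = self.language_encoder(
            self.word_embed(instruction_tokens))
        lang = lang.squeeze(0)                                # (B, H)
        h, c = self.memory(torch.cat([img, lang], dim=-1), memory_state)
        return self.policy_head(h), self.value_head(h), (h, c)
```

Consistent with the quoted setup, exploration-specific networks (RND's trainable predictor network, NGU's inverse dynamics model) would be defined as separate modules that share no parameters with the policy or value heads.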