reproducibilityindex.ai

Improving Intrinsic Exploration with Language Abstractions

Authors: Jesse Mu, Victor Zhong, Roberta Raileanu, Minqi Jiang, Noah Goodman, Tim Rocktäschel, Edward Grefenstette

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Across 13 challenging, procedurally-generated, sparse-reward tasks in the MiniGrid [8] and MiniHack [41] environment suites, we show that language-parameterized exploration methods outperform their non-linguistic counterparts by 47–85%, especially in more abstract tasks with larger state and action spaces.
Researcher Affiliation	Collaboration	(1) Stanford University, (2) University of Washington, (3) Meta AI, (4) University College London, (5) Cohere
Pseudocode	Yes	Algorithm S1 in Appendix A describes how L-AMIGo trains in an asynchronous actor-critic framework, where the student and teacher are jointly trained from batches of experience collected from separate actor threads, as used in our experiments (see Section 6).
Open Source Code	Yes	Code included with supplementary material and will be made public upon acceptance, with a link in Appendix C (currently anonymized)
Open Datasets	Yes	We evaluate on the most challenging tasks in MiniGrid [8]... To add language, we use the complementary BabyAI platform [9]... MiniHack [41] is a suite of procedurally-generated tasks...
Dataset Splits	No	Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In Appendices B, C, G
Hardware Specification	Yes	All experiments were run on a single NVIDIA A100 GPU for 7 days.
Software Dependencies	No	We evaluate L-AMIGo, AMIGo, L-NovelD, and NovelD, implemented in the TorchBeast [27] implementation of IMPALA [17], a common asynchronous actor-critic method.
Experiment Setup	Yes	for full model, training, and hyperparameter details, see Appendix C.