Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Survey of Zero-shot Generalisation in Deep Reinforcement Learning

Authors: Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel

JAIR 2023 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This survey is an overview of this nascent field. We rely on a unifying formalism and terminology for discussing different ZSG problems, building upon previous works. We go on to categorise existing benchmarks for ZSG, as well as current methods for tackling these problems. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. |
| Researcher Affiliation | Collaboration | Robert Kirk (EMAIL), University College London, Gower St, London WC1E 6BT, United Kingdom; Amy Zhang (EMAIL), University of California, Berkeley, Berkeley CA, United States; Meta AI Research |
| Pseudocode | No | The paper is a survey that describes methodologies and benchmarks from other research. It does not present its own new algorithms or methods in pseudocode or algorithm blocks. The text describes methods in natural language without structured steps formatted like code. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in *this* paper. It is a survey paper that refers to code releases and project websites of *other* research works (e.g., "Code and videos are available at https://nicklashansen.github.io/SODA/" referring to Hansen and Wang's work, and "Project site with videos and code: https://agarwl.github.io/pse" referring to Agarwal et al.'s work), but does not itself release code for the survey's analysis or framework. |
| Open Datasets | No | The paper is a survey and does not conduct its own experiments using a specific dataset. It refers to various existing benchmarks and environments (e.g., "Atari", "MuJoCo", "OpenAI Procgen", "Distracting Control Suite") that may have associated datasets, but it does not use a dataset for its own analysis or provide access information for a dataset used in its own work. Although it mentions a GitHub link for datasets in the context of offline RL benchmarks (Gulcehre et al., 2021), this refers to *other* datasets discussed in the paper, not a dataset used by this paper for its own analysis. |
| Dataset Splits | No | The paper is a survey and does not conduct its own experiments or use a specific dataset for its analysis, so it does not provide dataset split information. It describes how dataset splits are used in other benchmarks (e.g., "training on a fixed set of 200 levels and then evaluate performance on the full distribution of levels" for Procgen), but these are not splits used by the paper itself. |
| Hardware Specification | No | The paper is a survey of zero-shot generalization in deep reinforcement learning and does not conduct its own experiments. Therefore, it does not specify any hardware details (such as GPU models, CPU types, or memory amounts) used for running experiments. |
| Software Dependencies | No | The paper is a survey and does not describe its own experimental methodology or implementation details that would require specific software dependencies with version numbers for replication. It discusses various algorithms and methods developed by others but does not provide details on its own software environment. |
| Experiment Setup | No | The paper is a survey of zero-shot generalization in deep reinforcement learning and does not describe its own experimental setup, hyperparameters, or training configurations. It analyzes and categorizes the experimental setups and methodologies of other research works but does not provide these details for its own content. |