Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Teachable Reinforcement Learning via Advice Distillation

Authors: Olivia Watkins, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Jacob Andreas

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In puzzle-solving, navigation, and locomotion domains, we show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms and often less than imitation learning.
Researcher Affiliation	Academia	Olivia Watkins UC Berkeley EMAIL Trevor Darrell UC Berkeley EMAIL Pieter Abbeel UC Berkeley EMAIL Jacob Andreas MIT EMAIL Abhishek Gupta UC Berkeley EMAIL
Pseudocode	No	The paper describes the algorithms in prose and uses diagrams (e.g., Figure 2) but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code	Yes	Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Appendix A for link to URL and run instructions in the README in the github repo.
Open Datasets	Yes	Baby AI: In the open-source Baby AI [8] grid-world... Ant-Maze Navigation (Ant): The open-source ant-maze navigation domain [14] replaces the simple point mass agent... Envs we used are cited in section 4.1
Dataset Splits	Yes	The details of the exact set of training and testing tasks, as well as architecture and algorithmic details, are provided in the appendix. Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.
Hardware Specification	Yes	Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix A
Software Dependencies	Yes	Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.
Experiment Setup	Yes	The details of the exact set of training and testing tasks, as well as architecture and algorithmic details, are provided in the appendix. Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Appendix A.