reproducibilityindex.ai

Learning to Reach Goals via Iterated Supervised Learning

Authors: Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Manon Devin, Benjamin Eysenbach, Sergey Levine

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We formally show that this iterated supervised learning procedure optimizes a bound on the RL objective, derive performance bounds of the learned policy, and empirically demonstrate improved goal-reaching performance and robustness over current RL algorithms in several benchmark tasks.
Researcher Affiliation	Academia	Dibya Ghosh (UC Berkeley); Abhishek Gupta (UC Berkeley); Ashwin Reddy (UC Berkeley); Justin Fu (UC Berkeley); Coline Devin (UC Berkeley); Benjamin Eysenbach (Carnegie Mellon University); Sergey Levine (UC Berkeley)
Pseudocode	Yes	Algorithm 1 Goal-Conditioned Supervised Learning (GCSL)
Open Source Code	Yes	We have additionally open-sourced our implementation at https://github.com/dibyaghosh/gcsl.
Open Datasets	Yes	Lunar Lander (Brockman et al., 2016): This environment requires a rocket to land in a specified region.
Dataset Splits	No	The paper describes data collection and training procedures but does not explicitly detail training/validation/test dataset splits with percentages or sample counts for reproducibility.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments.
Software Dependencies	No	The paper mentions using the 'Adam optimizer' but does not provide version numbers for this or any other software dependencies, which are necessary for full reproducibility.
Experiment Setup	Yes	The GCSL loss is optimized using the Adam optimizer with learning rate α = 5 × 10⁻⁴, with a batch size of 256, taking one gradient step for every step in the environment.
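
The Experiment Setup row can be written out as a small configuration sketch, which may help when checking a re-implementation against the reported values. The learning rate is taken as 5e-4, the batch size as 256, and the update schedule as one gradient step per environment step. Everything else below is an assumption made for illustration: the names GCSLConfig, relabel, and sample_batch are hypothetical and are not taken from the authors' repository, and the relabeling step (treating a later state of a trajectory as the goal that was reached) is a simplified reading of the iterated supervised learning procedure, not the authors' code.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GCSLConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""
    learning_rate: float = 5e-4        # Adam step size, as reported
    batch_size: int = 256              # minibatch size, as reported
    grad_steps_per_env_step: int = 1   # one gradient step per environment step


def relabel(trajectory, rng):
    """Draw one supervised training tuple from a single trajectory.

    `trajectory` is a list of (state, action) pairs with at least two entries.
    A state visited later in the same trajectory is used as the goal
    (assumed relabeling scheme, simplified for illustration).
    """
    t = rng.randrange(len(trajectory) - 1)      # current time step
    h = rng.randrange(t + 1, len(trajectory))   # strictly later time step
    state, action = trajectory[t]
    goal, _ = trajectory[h]
    return state, action, goal, h - t           # h - t is the remaining horizon


def sample_batch(trajectories, cfg, rng):
    """Sample cfg.batch_size relabeled tuples, with replacement."""
    return [relabel(rng.choice(trajectories), rng) for _ in range(cfg.batch_size)]
```

In a full training loop, each tuple would feed a supervised loss on the action given the state, goal, and horizon, optimized with Adam at cfg.learning_rate. That part is omitted here because the table does not specify the framework or its version.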