Autonomous Reinforcement Learning via Subgoal Curricula
Authors: Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark VaPRL on several robotic control tasks in the persistent RL setting against state-of-the-art methods, which either simulate the initial state distribution by learning a reset controller, or incrementally grow the state space from which the given task can be solved. Our experiments indicate that using a tailored curriculum generated by VaPRL can be up to 30% more sample-efficient in acquiring task behaviors compared to these prior methods. |
| Researcher Affiliation | Collaboration | Archit Sharma, Abhishek Gupta#, Sergey Levine#, Karol Hausman, Chelsea Finn (Stanford University, Google Brain, # UC Berkeley); {architsh,cbfinn}@stanford.edu, {abhishekunique,slevine,karolhausman}@google.com |
| Pseudocode | Yes | Algorithm 1: Value-Accelerated Persistent Reinforcement Learning (VaPRL) |
| Open Source Code | No | In the ethics checklist, the authors state: '[No], we will release the code and the environments upon publication.' The provided URL is a project page, not a direct code repository. |
| Open Datasets | No | The paper describes using simulated environments (table-top rearrangement, Sawyer door closing, hand manipulation) and providing 'a small set of trajectories' or 'demonstrations' to the algorithms. It does not provide concrete access information (link, DOI, formal citation) for a publicly available dataset used for training. While it references 'Meta-world', it does not specify how the data derived from it can be publicly accessed. |
| Dataset Splits | No | The paper discusses a training environment M_T and an evaluation environment M_E, but it does not explicitly mention a validation set or provide specific training/validation/test splits (e.g., percentages or sample counts) needed for reproduction. |
| Hardware Specification | No | The main body of the paper does not specify the hardware used (e.g., GPU models, CPU types, memory). While the ethics checklist indicates this information is in the Appendix, the Appendix itself is not provided in the analyzed text. |
| Software Dependencies | No | The paper mentions using 'soft actor-critic [17] as the base RL algorithm' and refers to 'Tensorflow agents [18]', but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | No | The paper states: 'Further details about problem setup, demonstrations, implementation, hyperparameters and evaluation metrics can be found in the Appendix.' This indicates that specific experimental setup details, such as hyperparameters, are not present in the main text provided. |
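Algorithm 1 (VaPRL) is referenced in the Pseudocode row above but is not reproduced in the analyzed text. The sketch below is a minimal, illustrative reconstruction of the general idea summarized in the Research Type row, assuming a learned goal-conditioned value function and a small set of demonstration states; the function name `select_subgoal`, the threshold `epsilon`, and the distance-based tie-breaking are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def select_subgoal(value_fn, demo_states, initial_state, task_goal, epsilon=0.3):
    """Hypothetical sketch of value-based subgoal curriculum selection.

    Among demonstration states whose estimated value for reaching the task
    goal exceeds `epsilon`, return the one closest to the initial state, so
    the curriculum gradually moves subgoals toward the states the agent must
    handle in the evaluation setting. Falls back to the task goal if no
    demonstration state qualifies. This is NOT the paper's Algorithm 1.
    """
    candidates = [s for s in demo_states if value_fn(s, task_goal) >= epsilon]
    if not candidates:
        return task_goal
    distances = [np.linalg.norm(np.asarray(s) - np.asarray(initial_state))
                 for s in candidates]
    return candidates[int(np.argmin(distances))]


if __name__ == "__main__":
    # Toy 2-D states, purely illustrative.
    demo_states = [np.array([0.2, 0.0]), np.array([0.5, 0.1]), np.array([0.9, 0.4])]
    task_goal = np.array([1.0, 0.5])
    initial_state = np.array([0.0, 0.0])
    # Stand-in for a learned goal-conditioned value estimate.
    value_fn = lambda s, g: 1.0 - min(np.linalg.norm(s - g), 1.0)
    print(select_subgoal(value_fn, demo_states, initial_state, task_goal))
```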