Reverse Forward Curriculum Learning for Extreme Sample and Demo Efficiency

Authors: Stone Tao, Arth Shukla, Tse-kai Chan, Hao Su

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We rigorously evaluate RFCL against several state-of-the-art baselines across 21 fully-observable manipulation tasks from 3 benchmarks: Adroit, ManiSkill2, and Meta-World (Rajeswaran et al., 2018; Gu et al., 2023; Yu et al., 2019)."
Researcher Affiliation | Academia | "Stone Tao, Arth Shukla, Tse-kai Chan, Hao Su. University of California, San Diego. {stao, arshukla, tsc003, haosu}@ucsd.edu"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Website with code and visualizations are here: https://reverseforward-cl.github.io/"; "All code is open sourced on GitHub and we are excited with how far the community can push when leveraging more properties of simulation."; "All experiments on the RFCL method can be reproduced with given runnable scripts + docker images uploaded on https://github.com/stonet2000/rfcl"
Open Datasets | Yes | "We rigorously evaluate RFCL against several state-of-the-art baselines across 21 fully-observable manipulation tasks from 3 benchmarks: Adroit, ManiSkill2, and Meta-World (Rajeswaran et al., 2018; Gu et al., 2023; Yu et al., 2019)."
Dataset Splits | No | The paper does not explicitly describe train/validation/test splits or a cross-validation setup of the kind used for supervised learning datasets; it refers only to training interaction steps and evaluation episodes.
Hardware Specification | Yes | "Experiments all ran on a RTX 2080 GPU."
Software Dependencies | No | The paper mentions software such as Soft Actor-Critic but does not specify versions for programming languages (e.g., Python), libraries (e.g., PyTorch), or other components used in the experiments.
Experiment Setup | Yes | "Table 5: RFCL sample-efficient variation of hyperparameters. These are the ones used to generate all figures and results. Highlighted in blue indicates hyperparameters introduced by this paper, which are for the automatic construction of reverse and forward curriculums. The non-highlighted hyperparameters are standard ones used in Soft-Actor-Critic with a Q-ensemble or Prioritized Level Replay (PLR)."
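The hyperparameter table above references Soft Actor-Critic with a Q-ensemble. As a rough illustration only (not the paper's implementation; function name, subset size, and all arguments are assumptions), a common way such ensembles are used is to form a conservative TD target from the minimum over a random subset of ensemble Q-estimates, REDQ-style:

```python
import random

def ensemble_q_target(q_values, reward, gamma, subset_size=2, rng=None):
    """Illustrative sketch: conservative TD target via a random subset
    of next-state Q-estimates from an ensemble of critics.

    q_values    -- next-state Q-estimates, one per ensemble member
    subset_size -- how many members to sample for the min (assumption)
    """
    rng = rng or random.Random(0)
    # Take the min over a random subset of critics to curb
    # overestimation bias in the bootstrapped target.
    subset = rng.sample(q_values, subset_size)
    return reward + gamma * min(subset)
```

The target is bounded between the most and least pessimistic single-critic targets; taking the min over only a subset (rather than all members) keeps the estimate conservative without being overly pessimistic.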