C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks

Authors: Tianjun Zhang, Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine, Joseph E. Gonzalez

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we demonstrate that our method is more sample efficient than prior methods. Moreover, it is able to solve very long-horizon manipulation and navigation tasks, tasks that prior goal-conditioned methods and methods based on graph search fail to solve. Section 5 (Experiments): Our experiments study whether C-Planning can compete with prior goal-conditioned RL methods both on benchmark tasks and on tasks designed to pose a significant planning and exploration challenge.
Researcher Affiliation | Academia | Tianjun Zhang (UC Berkeley, tianjunz@berkeley.edu); Benjamin Eysenbach (Carnegie Mellon University, beysenba@cs.cmu.edu); Ruslan Salakhutdinov (Carnegie Mellon University); Sergey Levine (UC Berkeley); Joseph E. Gonzalez (UC Berkeley)
Pseudocode | Yes | Algorithm 1: C-Planning performs planning in data collection, modifies C-learning by L5 L6 7. The update for the policy and classifier (L9) is the same. Algorithm 2: C-Planning samples the intermediate waypoints, then commands the agent to reach them.
Open Source Code | Yes | Our code is available at https://github.com/tianjunz/c-planning.
Open Datasets | Yes | The first set of environments is taken from the Metaworld suite (Yu et al., 2020), a common benchmark for goal-conditioned RL.
Dataset Splits | No | The paper describes environments and tasks (Metaworld, 2D mazes) and discusses data collection during training, but does not specify explicit training, validation, and test dataset splits with percentages or counts.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud computing specifications used for running experiments.
Software Dependencies | No | The paper refers to using the SAC algorithm and building upon C-learning, but does not provide specific version numbers for any software libraries, frameworks, or dependencies used in the implementation.
Experiment Setup | Yes | We provide the essential hyperparameters for reproducing our experiments in this section. We also introduce the hyperparameters used in baselines and provide a detailed description of environmental design. Table 1: Hyperparameters used for C-Planning in all the environments in Meta World. Table 2: Hyperparameters used for C-Planning in all the environments in 2D navigation maze.
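
The Pseudocode entry describes Algorithm 2 as sampling an intermediate waypoint and then commanding the agent to reach it. The following is a minimal sketch of that collection loop on a toy 1D integer line, not the authors' implementation. Every name here (`sample_waypoint`, `rollout`, `toy_policy`, `toy_score`) is hypothetical, and the scoring rule (product of two reachability scores) is an illustrative assumption standing in for whatever learned classifier the paper uses.

```python
import random


def toy_score(a, b):
    """Stand-in for a learned reachability score (assumption): closer is higher."""
    return 1.0 / (1.0 + abs(a - b))


def toy_policy(state, target):
    """Toy goal-conditioned policy: move one unit toward the target."""
    if state < target:
        return state + 1
    if state > target:
        return state - 1
    return state


def sample_waypoint(state, goal, candidates, score, rng):
    """Sample a waypoint with probability proportional to
    score(state, w) * score(w, goal). This weighting is an assumption."""
    weights = [score(state, w) * score(w, goal) for w in candidates]
    if sum(weights) <= 0:
        return goal  # fall back to commanding the goal directly
    return rng.choices(candidates, weights=weights, k=1)[0]


def rollout(state, goal, candidates, policy, score, rng,
            waypoint_steps, goal_steps):
    """Command the agent toward a sampled waypoint, then toward the goal."""
    trajectory = [state]
    waypoint = sample_waypoint(state, goal, candidates, score, rng)
    for target, budget in ((waypoint, waypoint_steps), (goal, goal_steps)):
        for _ in range(budget):
            if state == target:
                break  # target reached, move on to the next command
            state = policy(state, target)
            trajectory.append(state)
    return waypoint, trajectory


# Example: start at 0, final goal at 10, candidate waypoints 0..10.
rng = random.Random(0)
waypoint, trajectory = rollout(0, 10, list(range(11)), toy_policy,
                               toy_score, rng, waypoint_steps=20, goal_steps=20)
```

In a real setting the candidates would presumably come from previously visited states and the trajectory would be stored for training; both are left out of this sketch.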