Automatic Curriculum Learning through Value Disagreement
Authors: Yunzhi Zhang, Pieter Abbeel, Lerrel Pinto
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods. |
| Researcher Affiliation | Academia | Yunzhi Zhang (UC Berkeley); Pieter Abbeel (UC Berkeley); Lerrel Pinto (UC Berkeley, NYU) |
| Pseudocode | Yes | Algorithm 1 Curriculum Learning with Value Disagreement Sampling (a minimal sampling sketch follows this table) |
| Open Source Code | Yes | Code is publicly available at https://github.com/zzyunzhi/vds. |
| Open Datasets | Yes | We test our methods on 13 manipulation goal-conditioned tasks, 3 maze navigation tasks and 2 Ant-embodied navigation tasks, all with sparse reward, as shown in Figure 2. Detailed setup of the environments is presented in Appendix C. |
| Dataset Splits | No | The paper discusses 'training goals' and 'evaluation goals' but does not specify a distinct 'validation' dataset split for hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | In fact, all of our experiments using VDS are run on a single CPU. |
| Software Dependencies | No | The paper mentions algorithms like DDPG and HER, but does not provide specific version numbers for software libraries, programming languages, or other dependencies. |
| Experiment Setup | Yes | Detailed hyper-parameter settings are specified in the Appendix D |
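
The pseudocode row above refers to the paper's Algorithm 1 (Value Disagreement Sampling), which trains an ensemble of goal-conditioned value functions and samples training goals in proportion to how much the ensemble members disagree. The snippet below is a minimal sketch of that sampling step only, not the authors' released implementation (see the GitHub link above); the function name, the `q_ensemble` callable interface, and the `temperature` parameter are illustrative assumptions.

```python
import numpy as np

def sample_goals_by_value_disagreement(candidate_goals, state, q_ensemble,
                                       num_samples, temperature=1.0):
    """Sample training goals with probability proportional to the disagreement
    (standard deviation) of an ensemble of goal-conditioned value estimates.

    `q_ensemble` is assumed to be a list of callables mapping
    (state, goal) -> scalar value estimate.
    """
    # Value predictions for every ensemble member and every candidate goal:
    # shape (ensemble_size, num_goals).
    values = np.stack([
        np.array([q(state, g) for g in candidate_goals], dtype=np.float64)
        for q in q_ensemble
    ])
    # Epistemic disagreement = per-goal standard deviation across the ensemble.
    disagreement = values.std(axis=0)

    # Turn disagreement scores into a sampling distribution over goals.
    scores = disagreement ** (1.0 / temperature)
    total = scores.sum()
    if total > 0:
        probs = scores / total
    else:
        # Ensemble agrees everywhere: fall back to uniform goal sampling.
        probs = np.full(len(candidate_goals), 1.0 / len(candidate_goals))

    idx = np.random.choice(len(candidate_goals), size=num_samples, p=probs)
    return [candidate_goals[i] for i in idx]
```

The design intuition is that ensemble disagreement acts as a proxy for epistemic uncertainty: goals the agent already masters (or cannot yet reach at all) produce similar value estimates across the ensemble, while goals at the frontier of the agent's competence produce the largest disagreement and are therefore sampled most often.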