Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning
Authors: Yutong Wang, Ke Xue, Chao Qian
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various (i.e., deceptive and multi-modal) continuous control tasks show the superior performance of EDO-CS over previous methods, i.e., EDO-CS can efficiently achieve a set of policies with both high quality and diversity, while previous methods cannot. |
| Researcher Affiliation | Academia | Yutong Wang, Ke Xue, and Chao Qian; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; {wangyt, xuek, qianc}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1: EDO-CS |
| Open Source Code | No | The paper does not provide an explicit statement or link to its open-source code. |
| Open Datasets | Yes | To examine the performance of EDO-CS, we conduct experiments on a variety of continuous control tasks from the OpenAI Gym library (Brockman et al., 2016). |
| Dataset Splits | No | The paper mentions training and testing environments but does not explicitly describe a validation dataset split or its usage. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions the 'OpenAI Gym library (Brockman et al., 2016)' and 'MuJoCo environments' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For EDO-CS, the number M of candidates for selection in each cluster is set to 2, and the arms of the bandit are {λ(1) = 0, λ(2) = 0.5}. Other parameter settings can be found in Appendix A.1. (Appendix A.1 provides tables with specific numerical values for the population size, archive size l, number T of updating iterations, and the ES parameters σ and η for each environment.) |
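The clustering-based selection named in the paper's title can be sketched as follows. Since the paper releases no code, this is a minimal illustration, not the authors' implementation: it assumes policies are summarized by behavior-descriptor vectors, clusters them with plain k-means (farthest-point initialization), and keeps the top M = 2 candidates per cluster by fitness, matching the setup row above.

```python
import numpy as np


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with farthest-point initialization (illustrative only)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Next center: the point farthest from all current centers.
        dists = np.min([((points - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def clustering_based_selection(behaviors, fitness, k, m=2):
    """Cluster policies by behavior, then select the top-m of each cluster by fitness."""
    labels = kmeans(behaviors, k)
    selected = []
    for j in range(k):
        idx = np.flatnonzero(labels == j)
        # Highest-fitness m members of cluster j.
        selected.extend(idx[np.argsort(fitness[idx])[::-1][:m]].tolist())
    return sorted(selected)
```

For example, with six policies forming two well-separated behavior clusters, the selection keeps the two best policies of each cluster, preserving diversity that a purely fitness-based top-4 would discard.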