An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search
Authors: Kyunghyun Lee, Byeong-Uk Lee, Ukcheol Shin, In So Kweon
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed framework and update methods are evaluated in continuous control benchmark work, showing superior performance as well as time efficiency compared to the previous methods. and We evaluate the algorithms in several simulated environments which are commonly used as benchmarks in policy search: HalfCheetah-v2, Hopper-v2, Walker2d-v2, Swimmer-v2, Ant-v2, and Humanoid-v2 [36]. The presented statistics were calculated and averaged over 10 runs with the same configuration. |
| Researcher Affiliation | Academia | Kyunghyun Lee Byeong-Uk Lee Ukcheol Shin In So Kweon Korea Advanced Institute of Science and Technology (KAIST) Daejeon, Korea {kyunghyun.lee, byeonguk.lee, shinwc159, iskweon77}@kaist.ac.kr |
| Pseudocode | Yes | A pseudocode for the whole algorithm is described in Appendix B. and Our overall algorithm pseudo-code is presented in Appendix B. |
| Open Source Code | Yes | The source code of our implementation is available at https://github.com/KyunghyunLee/aes-rl |
| Open Datasets | Yes | We evaluate the algorithms in several simulated environments which are commonly used as benchmarks in policy search: HalfCheetah-v2, Hopper-v2, Walker2d-v2, Swimmer-v2, Ant-v2, and Humanoid-v2 [36]. |
| Dataset Splits | No | The paper mentions evaluating algorithms in various simulated environments, but it does not explicitly provide details about training, validation, or test dataset splits (e.g., percentages or counts). |
| Hardware Specification | Yes | All three algorithms are evaluated in HalfCheetah-v2, which has the fixed episode steps, and Walker2d-v2 and Hopper-v2, which has varying episode steps, in the same hardware configuration, two Ethernet-connected machines of Intel i7-6800k and three NVIDIA GeForce 1080Ti; a total of 24 CPU cores and 6 GPUs. |
| Software Dependencies | No | The paper mentions using TD3 [5] in the RL part, but it does not provide specific version numbers for any software components, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | Detailed architecture and hyperparameters for all methods are shown in Appendix A and C, respectively. |
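
The Research Type row notes that the reported statistics were averaged over 10 runs with the same configuration. The sketch below shows one way such a per-environment summary could be computed. It is a minimal illustration, not the authors' evaluation code: the helper `summarize_runs` and the values in `example_returns` are hypothetical placeholders, and only the environment names come from the table above.

```python
from statistics import mean, stdev

# Environment names as listed in the table above.
ENV_IDS = [
    "HalfCheetah-v2", "Hopper-v2", "Walker2d-v2",
    "Swimmer-v2", "Ant-v2", "Humanoid-v2",
]

NUM_RUNS = 10  # the table states that statistics are averaged over 10 runs


def summarize_runs(final_returns):
    """Return (mean, sample standard deviation) over independent runs.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if len(final_returns) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return mean(final_returns), stdev(final_returns)


# Placeholder values for 10 runs; these are NOT results from the paper.
example_returns = [float(i) for i in range(1, NUM_RUNS + 1)]
avg, spread = summarize_runs(example_returns)
print(f"mean={avg:.2f}, std={spread:.2f} over {len(example_returns)} runs")
```

In practice each entry of `example_returns` would be replaced by the final evaluation return of one independently seeded run on a given environment, and the summary repeated for every name in `ENV_IDS`.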