reproducibilityindex.ai

An Efficient Asynchronous Method for Integrating Evolutionary and Gradient-based Policy Search

Authors: Kyunghyun Lee, Byeong-Uk Lee, Ukcheol Shin, In So Kweon

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The proposed framework and update methods are evaluated in continuous control benchmark work, showing superior performance as well as time efficiency compared to the previous methods. and We evaluate the algorithms in several simulated environments which are commonly used as benchmarks in policy search: HalfCheetah-v2, Hopper-v2, Walker2d-v2, Swimmer-v2, Ant-v2, and Humanoid-v2 [36]. The presented statistics were calculated and averaged over 10 runs with the same configuration.
Researcher Affiliation	Academia	Kyunghyun Lee Byeong-Uk Lee Ukcheol Shin In So Kweon Korea Advanced Institute of Science and Technology (KAIST) Daejeon, Korea {kyunghyun.lee, byeonguk.lee, shinwc159, iskweon77}@kaist.ac.kr
Pseudocode	Yes	A pseudocode for the whole algorithm is described in Appendix B. and Our overall algorithm pseudo-code is presented in Appendix B.
Open Source Code	Yes	The source code of our implementation is available at https://github.com/KyunghyunLee/aes-rl
Open Datasets	Yes	We evaluate the algorithms in several simulated environments which are commonly used as benchmarks in policy search: HalfCheetah-v2, Hopper-v2, Walker2d-v2, Swimmer-v2, Ant-v2, and Humanoid-v2 [36].
Dataset Splits	No	The paper mentions evaluating algorithms in various simulated environments, but it does not explicitly provide details about training, validation, or test dataset splits (e.g., percentages or counts).
Hardware Specification	Yes	All three algorithms are evaluated in HalfCheetah-v2, which has the fixed episode steps, and Walker2d-v2 and Hopper-v2, which has varying episode steps, in the same hardware configuration, two Ethernet-connected machines of Intel i7-6800k and three NVidia GeForce 1080Ti; a total of 24 CPU cores and 6 GPUs.
Software Dependencies	No	The paper mentions using TD3 [5] in the RL part, but it does not provide specific version numbers for any software components, libraries, or frameworks used in the experiments.
Experiment Setup	Yes	Detailed architecture and hyperparameters for all methods are shown in Appendix A and C, respectively.
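
The evaluation protocol quoted in the table (six benchmark environments, statistics averaged over 10 runs) can be sketched as a small aggregation helper. This is a minimal illustration and not the authors' code: the function name `summarize_runs`, the use of the sample standard deviation as the spread measure, and the placeholder numbers are all assumptions. Only the environment names and the run count come from the table.

```python
from statistics import mean, stdev

# Environment IDs as listed in the table above.
ENV_IDS = [
    "HalfCheetah-v2", "Hopper-v2", "Walker2d-v2",
    "Swimmer-v2", "Ant-v2", "Humanoid-v2",
]
NUM_RUNS = 10  # the table quotes statistics "averaged over 10 runs"


def summarize_runs(returns_per_run):
    """Return (mean, sample standard deviation) of per-run returns.

    Hypothetical helper: the choice of sample standard deviation is an
    assumption; the table only states that statistics were averaged.
    """
    if len(returns_per_run) != NUM_RUNS:
        raise ValueError(f"expected {NUM_RUNS} runs, got {len(returns_per_run)}")
    return mean(returns_per_run), stdev(returns_per_run)


# Placeholder numbers for illustration only; NOT results from the paper.
placeholder_returns = [float(i) for i in range(1, NUM_RUNS + 1)]
avg, spread = summarize_runs(placeholder_returns)
print(f"mean={avg:.2f}, std={spread:.2f}")
```

In practice the placeholder list would be replaced by the final returns logged from ten independent runs of one environment under the same configuration.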