Trust Region Evolution Strategies

Authors: Guoqing Liu, Li Zhao, Feidiao Yang, Jiang Bian, Tao Qin, Nenghai Yu, Tie-Yan Liu
Pages: 4352-4359

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate the effectiveness of TRES on a range of popular MuJoCo locomotion tasks in the OpenAI Gym, achieving better performance than ES algorithm.
Researcher Affiliation | Collaboration | Guoqing Liu, Li Zhao, Feidiao Yang, Jiang Bian, Tao Qin, Nenghai Yu, Tie-Yan Liu; University of Science and Technology of China; Microsoft Research; Institute of Computing Technology, Chinese Academy of Sciences; lgq1001@mail.ustc.edu.cn; {lizo, jiang.bian, taoqin, tyliu}@microsoft.com; yangfeidiao@ict.ac.cn
Pseudocode | Yes | Algorithm 1 Trust Region Evolution Strategies
Open Source Code | No | Not found. The paper does not provide explicit statements about the release of its own source code or links to a repository.
Open Datasets | Yes | To demonstrate the effectiveness of TRES, we conducted experiments on the continuous MuJoCo locomotion tasks from the OpenAI Gym (Brockman et al. 2016).
Dataset Splits | No | Not found. The paper describes training and evaluation but does not specify explicit training/validation/test dataset splits by percentage or sample count.
Hardware Specification | No | Not found. The paper does not specify the exact hardware (e.g., specific CPU or GPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | Not found. The paper mentions software like 'OpenAI evolution-strategies-starter code' and 'OpenAI Gym' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | Input: noise standard deviation σ, clip factor λ, epoch number K, learning rate α... We conducted several experiments to investigate their impact. (λ and K)... We observe that K = 15 can gain best performance.
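
The four inputs quoted under Experiment Setup (σ, λ, K, α) suggest a loop that samples perturbations once and then reuses them for K clipped updates. The following is a minimal NumPy sketch under that reading, not the paper's Algorithm 1. The function name `clipped_es_update`, the population size, the return normalization, the PPO-style clipped importance-ratio objective, and every default value except K = 15 are assumptions made for illustration.

```python
import numpy as np


def clipped_es_update(theta, fitness_fn, sigma=0.1, clip=0.2, epochs=15,
                      lr=0.005, pop_size=64, rng=None):
    """One outer iteration of a clipped-surrogate ES update (illustrative only).

    sigma  : noise standard deviation (σ)
    clip   : clip factor (λ); the default 0.2 is an assumption
    epochs : number of update epochs on the same samples (K); 15 is the
             value the summary reports as performing best
    lr     : learning rate (α); the default is an assumption
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    theta_old = theta.copy()

    # Sample once from the old search distribution N(theta_old, sigma^2 I).
    samples = theta_old + sigma * rng.standard_normal((pop_size, theta.size))
    returns = np.array([fitness_fn(x) for x in samples])

    # Assumption: standardize returns so they act like advantages.
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)

    for _ in range(epochs):
        # Importance ratio between the current and old Gaussian densities.
        log_ratio = (np.sum((samples - theta_old) ** 2, axis=1)
                     - np.sum((samples - theta) ** 2, axis=1)) / (2 * sigma ** 2)
        ratio = np.exp(log_ratio)

        # Gradient of min(ratio * adv, clip(ratio, 1-λ, 1+λ) * adv):
        # it vanishes for samples where the clipped branch is selected.
        clipped = ((ratio > 1 + clip) & (adv > 0)) | ((ratio < 1 - clip) & (adv < 0))
        weights = np.where(clipped, 0.0, ratio * adv)
        grad = np.mean(weights[:, None] * (samples - theta) / sigma ** 2, axis=0)

        # Gradient ascent on the surrogate objective.
        theta = theta + lr * grad

    return theta
```

A hypothetical usage on a toy quadratic (not one of the MuJoCo tasks) would call `clipped_es_update` repeatedly, passing the returned parameters back in as `theta`. Reusing one batch of evaluations for several epochs is what makes the clip factor matter: without it, later epochs could move the parameters far from the distribution the samples were drawn from.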