Self-Guided Evolution Strategies with Historical Estimated Gradients

Authors: Fei-Yu Liu, Zi-Niu Li, Chao Qian

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results on benchmark black-box functions and a set of popular RL tasks exhibit the superior performance of SGES over state-of-the-art ES algorithms.
Researcher Affiliation | Collaboration | Fei-Yu Liu (1,2), Zi-Niu Li (3), and Chao Qian (1). 1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; 2) University of Science and Technology of China, Hefei 230027, China; 3) Polixir, Nanjing 210038, China.
Pseudocode | Yes | Algorithm 1: SGES Algorithm
Open Source Code | No | The paper refers to open-source implementations of other algorithms used for comparison (CMA-ES, ASEBO, Guided ES, Nevergrad), but provides neither a link to nor an explicit statement about an open-source release of SGES itself.
Open Datasets | Yes | black-box functions from the recently open-sourced Nevergrad library [Rapin and Teytaud, 2018], and the continuous MuJoCo locomotion tasks (which are widely studied in the RL community) from the OpenAI Gym library [Brockman et al., 2016].
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits for its experiments (e.g., percentages, sample counts, or citations to predefined splits).
Hardware Specification | No | The paper does not specify the hardware (e.g., exact GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper mentions libraries such as Nevergrad and OpenAI Gym but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For these two types of tasks (black-box functions and RL tasks), the warmup iterations Tw of SGES are set to k and 2k, respectively. For fair comparisons, identical random seeds (2016, 2017, 2018, 2019, and 2020) are used for all tasks and algorithms. The initial value of α in SGES is set to 0.5. The smoothing parameter σ is set to a small value, 0.01. The learning rate η is chosen from {0.5, 0.1, 0.01, 0.001}. For all test functions, k is set to 20. For each RL task, the learning rate η, the smoothing parameter σ, and the sample size P are chosen as recommended by Mania et al. [2018]; the gradient-subspace dimension k is set to about half of P; the common hyperparameters of all algorithms are set to identical values.
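
To make the Open Datasets row concrete, the snippet below shows how the two benchmark suites could be instantiated. It is a hypothetical sketch: the specific task name (HalfCheetah-v2), the Nevergrad function name and dimension, and the evaluation call are assumptions not stated in the table, and the paper does not pin library versions.

```python
# Hypothetical loading of the two benchmark suites named in the Open Datasets row.
# Task/function names, dimensions, and the Nevergrad evaluation call are assumptions.
import numpy as np
import gym                                            # OpenAI Gym (MuJoCo tasks also need mujoco-py)
from nevergrad.functions import ArtificialFunction    # Nevergrad synthetic black-box functions

# A continuous MuJoCo locomotion task from OpenAI Gym.
env = gym.make("HalfCheetah-v2")
obs = env.reset()

# A black-box test function from the Nevergrad library.
func = ArtificialFunction(name="sphere", block_dimension=100)
value = func(np.zeros(func.parametrization.dimension))  # evaluate at the origin (assumed call signature)
```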
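
The Experiment Setup row lists concrete hyperparameter values; the sketch below collects them in one place and illustrates a single hybrid-sampling update in the spirit of SGES. It is not the paper's Algorithm 1: the objective `f`, the QR-based subspace construction, the antithetic gradient estimator, and the plain gradient-descent update are illustrative assumptions, and only the numeric values in `config` come from the quoted setup.

```python
# A minimal sketch, NOT the paper's Algorithm 1: only the values in `config` are
# quoted from the Experiment Setup row; everything else is an illustrative assumption.
import numpy as np

config = {
    "alpha_init": 0.5,                            # initial mixing weight α
    "sigma": 0.01,                                # smoothing parameter σ
    "eta_grid": [0.5, 0.1, 0.01, 0.001],          # candidate learning rates η
    "k_test_functions": 20,                       # gradient-subspace dimension for test functions
    "warmup_Tw": {"black_box": "k", "rl": "2k"},  # warm-up iterations Tw per task type
    "seeds": [2016, 2017, 2018, 2019, 2020],      # identical seeds for all tasks and algorithms
}

def hybrid_es_step(theta, f, grad_history, alpha, sigma=0.01, eta=0.1, pop=20, rng=None):
    """One illustrative update: each perturbation is drawn from the subspace spanned
    by historical estimated gradients with probability alpha, otherwise from the full
    space, and an antithetic finite-difference estimate drives a gradient-descent step."""
    rng = rng or np.random.default_rng(config["seeds"][0])
    d = theta.size
    hist = np.atleast_2d(np.asarray(grad_history, dtype=float))   # assumed shape (k, d)
    Q = np.linalg.qr(hist.T)[0] if hist.size else None            # d x k orthonormal basis
    grad_est = np.zeros(d)
    for _ in range(pop):
        if Q is not None and rng.random() < alpha:
            eps = Q @ rng.standard_normal(Q.shape[1])             # guided direction
        else:
            eps = rng.standard_normal(d)                          # isotropic direction
        eps /= np.linalg.norm(eps) + 1e-12
        grad_est += (f(theta + sigma * eps) - f(theta - sigma * eps)) / (2.0 * sigma) * eps
    grad_est /= pop
    return theta - eta * grad_est, grad_est                       # minimisation convention
```

A caller would maintain `grad_history` as a rolling buffer of recent gradient estimates and update α after the Tw warm-up iterations; those mechanisms are part of the paper's Algorithm 1 and are not reproduced here.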