Evolutionary Gradient Descent for Non-convex Optimization

Authors: Ke Xue, Chao Qian, Ling Xu, Xudong Fei

IJCAI 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We prove that EGD can converge to a second-order stationary point by escaping saddle points, and is more efficient than previous algorithms. Empirical results on non-convex synthetic functions as well as reinforcement learning (RL) tasks also show its superiority." |
| Researcher Affiliation | Collaboration | Ke Xue¹, Chao Qian¹, Ling Xu² and Xudong Fei². ¹State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; ²2012 Lab, Huawei Technologies, Shenzhen 518000, China |
| Pseudocode | Yes | Algorithm 1 (PGD), Algorithm 2 (EGD), Algorithm 3 (Mutation and Selection); illustrative sketches follow this table. |
| Open Source Code | No | The paper contains no statement or link indicating that source code for the described methodology is publicly available. |
| Open Datasets | Yes | "First, we compare these algorithms on three synthetic functions: Octopus, Ackley and Schwefel. ... Next, we examine the performance of EGD on four MuJoCo locomotion tasks: Swimmer-v2, Hopper-v2, HalfCheetah-v2 and Ant-v2 [Todorov et al., 2012]." |
| Dataset Splits | No | The paper uses synthetic functions and RL tasks for its experiments but gives no details on training/validation/test splits. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU or GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not give version numbers for any software libraries, frameworks, or environments used in the experiments. |
| Experiment Setup | Yes | "We use identical random seeds (2017, 2018, 2019, 2020, 2021) for all tasks and algorithms. The population size N is always set to 5. To make fair comparisons on each task, Multi-GD, Multi-PGD, ESGD and EGD employ the same learning rate η; (N+N)-EA, Multi-PGD and ESGD employ the same mutation strength r, while the N mutation strengths $\{r^{(p)}\}_{p=1}^{N}$ of EGD are set to uniformly discretized values between r and 1.2r (or 1.5r); the parameters L and ϵ of Multi-PGD and EGD are set to the same values." A configuration sketch follows this table. |
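
For context on the "escaping saddle points" claim, here is a minimal Python sketch of the perturbed gradient descent (PGD) template that the paper's Algorithm 1 names: plain gradient descent plus an occasional random perturbation whenever the gradient is small, which is what lets the iterate leave strict saddle points. The names, constants, and toy objective below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def perturbed_gd(grad, x, eta=0.1, eps=1e-3, radius=0.1,
                 t_noise=10, max_iter=500, rng=None):
    """Sketch of perturbed gradient descent: run GD, and add a random
    perturbation when the gradient is small (at most once every
    t_noise steps), so the iterate can escape strict saddle points."""
    rng = rng or np.random.default_rng(0)
    last_noise = -t_noise
    for t in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps and t - last_noise >= t_noise:
            u = rng.standard_normal(x.shape)
            x = x + radius * u / np.linalg.norm(u)  # jump off the (near-)saddle
            last_noise = t
        else:
            x = x - eta * g                         # ordinary GD step
    return x

# Toy objective f(x, y) = x^2 + (y^2 - 1)^2: strict saddle at the origin,
# minima at (0, +1) and (0, -1). Plain GD started at the saddle stays there;
# the perturbation pushes the iterate toward a minimum.
grad = lambda z: np.array([2 * z[0], 4 * z[1] * (z[1] ** 2 - 1)])
print(perturbed_gd(grad, np.zeros(2)))  # lands near (0, +1) or (0, -1)
```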
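The Mutation and Selection routine (Algorithm 3) is not reproduced in this report; the sketch below shows one plausible (N + N)-style reading of it, consistent with the setup row: each of the N individuals takes a gradient step, spawns a Gaussian-mutated offspring with its own strength r^(p), and the best N points among parents and offspring survive. The function names (`egd_step`, `mutate_and_select`) and the toy objective are our illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mutate_and_select(pop, f, strengths, rng):
    """(N + N)-style step: each parent spawns one Gaussian offspring
    with its own mutation strength r^(p); the N points with the lowest
    f-value among parents and offspring survive."""
    offspring = [x + s * rng.standard_normal(x.shape)
                 for x, s in zip(pop, strengths)]
    return sorted(pop + offspring, key=f)[:len(pop)]

def egd_step(pop, f, grad, eta, strengths, rng):
    """One EGD-flavoured iteration: a gradient-descent update on every
    individual, followed by mutation and selection."""
    pop = [x - eta * grad(x) for x in pop]
    return mutate_and_select(pop, f, strengths, rng)

# A simple non-convex toy objective and a short run with N = 5.
f = lambda x: float(np.sum(x ** 2 - np.cos(3 * x)))
grad = lambda x: 2 * x + 3 * np.sin(3 * x)
rng = np.random.default_rng(2021)
pop = [rng.uniform(-2, 2, size=2) for _ in range(5)]
strengths = np.linspace(0.1, 0.12, 5)  # r .. 1.2 r; see the setup sketch below
for _ in range(100):
    pop = egd_step(pop, f, grad, eta=0.05, strengths=strengths, rng=rng)
print(min(f(x) for x in pop))  # approaches the global minimum value of -2
```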
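Finally, the seed, population-size, and mutation-strength settings quoted in the Experiment Setup row translate directly into code; a minimal sketch (the helper name `egd_mutation_strengths` is ours):

```python
import numpy as np

SEEDS = [2017, 2018, 2019, 2020, 2021]  # identical seeds for all tasks and algorithms
N = 5                                   # population size, fixed in all experiments

def egd_mutation_strengths(r, factor=1.2, n=N):
    """N strengths uniformly discretized between r and factor * r;
    the paper uses factor 1.2 or 1.5 depending on the task."""
    return np.linspace(r, factor * r, n)

print(egd_mutation_strengths(0.1))  # [0.1   0.105 0.11  0.115 0.12 ]
```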