Noisy Derivative-Free Optimization With Value Suppression
Authors: Hong Wang, Hong Qian, Yang Yu
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on synthetic problems and on reinforcement learning tasks verify that value suppression can be significantly more effective than previous noise-handling methods. |
| Researcher Affiliation | Academia | Hong Wang, Hong Qian, Yang Yu; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; waghon@outlook.com, {qianh,yuy}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1: Value Suppression Framework for Derivative-Free Optimization; Algorithm 2: SRACOS; Algorithm 3: Suppressed SRACOS (SSRACOS). |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a direct link to a code repository. |
| Open Datasets | Yes | We conduct experiments on two synthetic functions and on control tasks of reinforcement learning in OpenAI Gym. OpenAI Gym provides a toolkit for reinforcement learning research (https://gym.openai.com). From its many control tasks we choose Acrobot, Mountain Car, Half Cheetah, Humanoid, Swimmer, Ant, Hopper, and Lunar Lander to compare the ability of each noise-handling mechanism to reduce the effects of noise (a rollout sketch appears after the table). |
| Dataset Splits | No | The paper does not specify traditional training/validation/test splits. For the reinforcement learning tasks, policies are evaluated by simulation in the environments rather than on static dataset splits; for the synthetic functions, only the total budget of function evaluations is specified. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions OpenAI Gym but does not specify its version or any other software libraries and their version numbers used in the experiments. |
| Experiment Setup | Yes | The parameters of the noise-handling mechanisms are set as follows. For sampling, the sample size is set to 10. For threshold selection, the threshold value is τ = σ. For value suppression, the maximum allowed number of non-update iterations is u = 500, the re-sample size is v = 100, and the balance parameter is α = 0.5 (a hedged sketch of this mechanism follows the table). The settings of the neural networks and OpenAI Gym tasks are listed in Table 2, where #State, #Actions, NN nodes, #Weights, and Horizon denote the observation dimension, the action dimension, the hidden-layer sizes of the neural network, the total number of parameters in the network, and the maximum number of steps, respectively. The mechanisms are compared under the same SRACOS parameter setting, listed in Table 3, where #B and #B+ denote the sizes of the negative and positive solution sets, respectively, and U-bits denotes the number of bits that may be changed when generating a new solution from a positive solution. |