From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization

Authors: Krzysztof M. Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Vikas Sindhwani

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical results and test ASEBO advantages over other methods empirically by evaluating it on the set of reinforcement learning policy optimization tasks as well as functions from the recently open-sourced Nevergrad library.
Researcher Affiliation | Collaboration | Krzysztof Choromanski (Google Brain Robotics, kchoro@google.com); Aldo Pacchiano (UC Berkeley, pacchiano@berkeley.edu); Jack Parker-Holder (University of Oxford, jackph@robots.ox.ac.uk); Yunhao Tang (Columbia University, yt2541@columbia.edu); Vikas Sindhwani (Google Brain Robotics, sindhwani@google.com)
Pseudocode | Yes | Algorithm 1 (ASEBO Algorithm). Hyperparameters: number of iterations of full sampling l, smoothing parameter σ > 0, step size η, PCA threshold ε, decay rate γ, total number of iterations T. Input: blackbox function F, vector θ_0 ∈ R^d where optimization starts. Cov_0 ∈ {0}^{d×d}, p_0 = 0. Output: vector θ_T. for t = 0, ..., T-1 do ... Algorithm 2 (Explore estimator via exponentiated sampling). Hyperparameters: smoothing parameter σ, horizon C, learning rate α, probability regularizer β, initial probability parameter q_0^t ∈ (0, 1). Input: subspaces L_active^ES and L_active^(ES,⊥), function F, vector θ_t. Output: ... for l = 1, ..., C+1 do ... (A hedged Python sketch of this loop appears after the table.)
Open Source Code | No | The paper mentions open-source implementations of other methods used for comparison (pycma, ARS) but provides no link to, or statement about, releasing ASEBO's own code.
Open Datasets | Yes | We used the following environments from the OpenAI Gym library: Swimmer-v2, HalfCheetah-v2, Walker2d-v2, Reacher-v2, Pusher-v2 and Thrower-v2. (A minimal rollout sketch for these environments follows the table.)
Dataset Splits | No | No explicit details about training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) are provided for the OpenAI Gym environments or Nevergrad functions.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned.
Software Dependencies | No | The paper mentions various software components and implementations (e.g., Adam, pycma, ARS, OpenAI baselines for PPO/TRPO) but does not provide specific version numbers for any of them.
Experiment Setup | Yes | In all experiments we used policies encoded by neural network architectures with two hidden layers and tanh nonlinearities, with > 100 parameters. For gradient-based optimization we use Adam. In practice one can set up the hyperparameters used by Algorithm 2 as follows: σ = 0.01, C = 10, α = 0.01, β = 0.1, q_0^t = 0.1. (A sketch of such a policy follows the table.)
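
The Pseudocode row above compresses Algorithm 1 into a single line. To make the control flow concrete, below is a minimal Python sketch of one ASEBO-style step. It is our interpretation, not the authors' code: it uses plain gradient ascent instead of Adam, a fixed 0.5 exploration probability in place of Algorithm 2's exponentiated sampling, omits the covariance decay γ, and the names asebo_step and grad_history are ours.

    import numpy as np

    def asebo_step(F, theta, grad_history, sigma=0.01, eta=0.01,
                   epsilon=0.995, n_samples=50, warmup=10):
        """One simplified ASEBO-style ES step (maximizes F)."""
        d = theta.shape[0]
        if len(grad_history) < warmup:
            # "Full sampling" warm-up phase: isotropic Gaussian directions.
            directions = np.random.randn(n_samples, d)
        else:
            # PCA of past ES-gradient estimates: keep the top components
            # explaining an epsilon fraction of variance (active subspace).
            G = np.stack(grad_history)
            _, s, Vt = np.linalg.svd(G, full_matrices=False)
            explained = np.cumsum(s ** 2) / np.sum(s ** 2)
            k = int(np.searchsorted(explained, epsilon)) + 1
            U = Vt[:k].T  # d x k orthonormal basis of the active subspace
            # Hybrid sampling: directions from the active subspace mixed
            # with isotropic exploration of its orthogonal complement.
            low = np.random.randn(n_samples, k) @ U.T
            iso = np.random.randn(n_samples, d)
            iso -= (iso @ U) @ U.T  # project out the active subspace
            # Fixed 0.5 here; the paper adapts this probability via
            # Algorithm 2's exponentiated sampling.
            pick_low = np.random.rand(n_samples, 1) < 0.5
            directions = np.where(pick_low, low, iso)
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)

        # Antithetic ES gradient estimate.
        grad = np.zeros(d)
        for g in directions:
            grad += (F(theta + sigma * g) - F(theta - sigma * g)) * g / (2 * sigma)
        grad /= n_samples

        grad_history.append(grad)
        return theta + eta * grad, grad_history

    if __name__ == "__main__":
        # Smoke test on a concave quadratic (optimum at the origin).
        F = lambda x: -float(np.sum(x ** 2))
        theta, hist = np.random.randn(10), []
        for _ in range(200):
            theta, hist = asebo_step(F, theta, hist)
        print(F(theta))  # should approach 0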
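
For the environments in the Open Datasets row, evaluating the blackbox function F reduces to episode rollouts. The helper below assumes the pre-0.26 gym API that the MuJoCo *-v2 tasks shipped with (reset returning only the observation, step returning a 4-tuple); the name rollout is ours.

    import gym  # the *-v2 MuJoCo tasks also require mujoco-py

    def rollout(env_name, policy, max_steps=1000):
        """Total reward of one episode under a deterministic policy."""
        env = gym.make(env_name)
        obs, total = env.reset(), 0.0
        for _ in range(max_steps):
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
            if done:
                break
        return total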
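
The Experiment Setup row specifies two-hidden-layer tanh policies with more than 100 parameters. Below is a sketch of such a policy with all weights flattened into a single vector, the natural parameterization for ES; the hidden width of 32 and the absence of biases are our assumptions, not stated in the paper.

    import numpy as np

    class TanhPolicy:
        """Two hidden tanh layers; parameters stored as one flat vector."""
        def __init__(self, obs_dim, act_dim, hidden=32, seed=0):
            rng = np.random.default_rng(seed)
            self.shapes = [(obs_dim, hidden), (hidden, hidden), (hidden, act_dim)]
            self.params = np.concatenate(
                [0.1 * rng.standard_normal(m * n) for (m, n) in self.shapes])

        def __call__(self, obs):
            x, i = np.asarray(obs), 0
            for (m, n) in self.shapes:
                W = self.params[i:i + m * n].reshape(m, n)
                i += m * n
                x = np.tanh(x @ W)  # tanh also bounds the final action
            return x

    # Example (Swimmer-v2 has 8-dim observations and 2-dim actions):
    # policy = TanhPolicy(obs_dim=8, act_dim=2)  # 8*32 + 32*32 + 32*2 = 1344 params
    # total_reward = rollout("Swimmer-v2", policy)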