From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization
Authors: Krzysztof M. Choromanski, Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Vikas Sindhwani
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide theoretical results and test ASEBO advantages over other methods empirically by evaluating it on the set of reinforcement learning policy optimization tasks as well as functions from the recently open-sourced Nevergrad library. |
| Researcher Affiliation | Collaboration | Krzysztof Choromanski Google Brain Robotics kchoro@google.com Aldo Pacchiano UC Berkeley pacchiano@berkeley.edu Jack Parker-Holder University of Oxford jackph@robots.ox.ac.uk Yunhao Tang Columbia University yt2541@columbia.edu Vikas Sindhwani Google Brain Robotics sindhwani@google.com |
| Pseudocode | Yes | Algorithm 1 (ASEBO Algorithm). Hyperparameters: number of iterations of full sampling l, smoothing parameter σ > 0, step size η, PCA threshold ε, decay rate γ, total number of iterations T. Input: blackbox function F, vector θ_0 ∈ ℝ^d where optimization starts. Cov^0 ∈ {0}^{d×d}, p_0 = 0. Output: vector θ_T. for t = 0, ..., T − 1 do ... Algorithm 2 (Explore estimator via exponentiated sampling). Hyperparameters: smoothing parameter σ, horizon C, learning rate α, probability regularizer β, initial probability parameter q^t_0 ∈ (0, 1). Input: subspaces L^{ES}_{active}, L^{ES,⊥}_{active}, function F, vector θ_t. Output: ... for l = 1, ..., C + 1 do ... |
| Open Source Code | No | The paper mentions open-source implementations of other methods used for comparison (pycma, ARS) but does not provide a link or statement for the open-sourcing of ASEBO's own code. |
| Open Datasets | Yes | We used the following environments from the OpenAI Gym library: Swimmer-v2, HalfCheetah-v2, Walker2d-v2, Reacher-v2, Pusher-v2 and Thrower-v2. |
| Dataset Splits | No | No explicit details about training, validation, or test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) are provided for the OpenAI Gym environments or Nevergrad functions. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned. |
| Software Dependencies | No | The paper mentions various software components and implementations (e.g., Adam, pycma, ARS, OpenAI baselines for PPO/TRPO) but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | In all experiments we used policies encoded by neural network architectures of two hidden layers and with tanh nonlinearities, with > 100 parameters. For gradient-based optimization we use Adam. In practice one can set up the hyperparameters used by Algorithm 2 as follows: σ = 0.01, C = 10, α = 0.01, β = 0.1, q^t_0 = 0.1. |
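To make the pseudocode row above concrete, the following is a minimal illustrative sketch of one ASEBO-style iteration: PCA on a decayed covariance of past ES gradients selects an active subspace explaining an ε fraction of variance, perturbations are sampled in that subspace, and an antithetic ES gradient updates the parameters. This is not the authors' implementation; the function names, default values, and the omission of the orthogonal-complement exploration of Algorithm 2 are simplifying assumptions.

```python
import numpy as np

def es_gradient(F, theta, directions, sigma=0.01):
    """Antithetic ES gradient estimate of F at theta along unit directions."""
    grad = np.zeros_like(theta)
    for g in directions:
        grad += (F(theta + sigma * g) - F(theta - sigma * g)) / (2.0 * sigma) * g
    return grad / len(directions)

def asebo_step(F, theta, cov, sigma=0.01, eta=0.02, epsilon=0.99,
               gamma=0.99, k_samples=8):
    """One sketched ASEBO-style iteration (hypothetical defaults).

    cov is a decayed covariance of past gradient estimates; its top
    eigenvectors span the "ES-active subspace" used for sampling.
    """
    d = theta.shape[0]
    # PCA of the gradient covariance: keep the smallest number of top
    # eigenvectors whose eigenvalues explain an epsilon fraction of variance.
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    total = vals.sum()
    if total <= 0:
        r = d  # no gradient history yet: sample in the full space
    else:
        r = int(np.searchsorted(np.cumsum(vals) / total, epsilon) + 1)
    basis = vecs[:, :r]                       # d x r active-subspace basis
    # Gaussian perturbations restricted to the active subspace, normalized.
    eps = np.random.randn(k_samples, r) @ basis.T
    eps /= np.linalg.norm(eps, axis=1, keepdims=True)
    grad = es_gradient(F, theta, eps, sigma)
    # Decay old information and add the new gradient's outer product.
    cov = gamma * cov + np.outer(grad, grad)
    return theta + eta * grad, cov            # ascent: F is a reward
```

In the full algorithm the sampling probability between the active subspace and its orthogonal complement is itself adapted online via exponentiated sampling (Algorithm 2); here it is dropped so the subspace-selection mechanism stays visible.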