Learning to Guide Random Search
Authors: Ozan Sener, Vladlen Koltun
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate the method on continuous optimization benchmarks and high-dimensional continuous control problems. Our method achieves significantly lower sample complexity than Augmented Random Search, Bayesian optimization, covariance matrix adaptation (CMA-ES), and other derivative-free optimization algorithms. We conduct extensive experiments on continuous control problems, continuous optimization benchmarks, and gradient-free optimization of an airfoil. |
| Researcher Affiliation | Industry | Ozan Sener (Intel Labs); Vladlen Koltun (Intel Labs) |
| Pseudocode | Yes | Algorithm 1 Random Search; Algorithm 2 Manifold Random Search; Algorithm 3 Learned Manifold Random Search (LMRS) |
| Open Source Code | Yes | A full implementation is available at https://github.com/intel-isl/LMRS. |
| Open Datasets | Yes | We use the MuJoCo simulator (Todorov et al., 2012) to evaluate our method on high-dimensional control problems. We use 46 single-objective unconstrained functions from the Pagmo suite of continuous optimization benchmarks (Biscani et al., 2019). We use the XFoil simulator (Drela, 1989) to benchmark gradient-free optimization of an airfoil. |
| Dataset Splits | No | The paper discusses running experiments and averaging results over multiple random seeds, but it does not specify explicit training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | Measurements are performed on Intel Xeon E7-8890 v3 processors and Nvidia GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions software like 'MuJoCo simulator', 'Pagmo suite', 'XFoil simulator', 'pycma', and 'GPyTorch', but it does not provide specific version numbers for any of these. |
| Experiment Setup | Yes | We use linear policies and include all the tricks (whitening the observation space and scaling the step size using the variance of the rewards) from Mania et al. (2018). We use grid search over δ and n = k values and choose the best performing one in all experiments. We initialize our models with standard normal distributions. We use online gradient descent to learn the model parameters using SGD with momentum 0.9. We also perform grid search for learning rate over {1e-4, 1e-3, 1e-2}. We set λ = 10³ for all experiments. We initialize all solutions with zero-mean, unit-variance normal variables and use grid search over δ ∈ {1e-4, 1e-3, 1e-2, 1e-1}, k ∈ {2, 5, 10, 50}, and α ∈ {1e-4, 1e-3, 1e-2, 1e-1}. (Illustrative sketches of the random-search step and the grid search follow the table.) |
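
The Pseudocode and Experiment Setup rows refer to Algorithm 1 (Random Search) run with the ARS tricks of Mania et al. (2018). Below is a minimal NumPy sketch of one antithetic random-search update for a linear policy, with the step size scaled by the standard deviation of the collected rewards. The function names and the toy `rollout` are illustrative assumptions, not taken from the paper's released code; observation whitening is assumed to happen inside `rollout`.

```python
import numpy as np

def random_search_step(theta, rollout, n_dirs=10, delta=1e-2, alpha=1e-2):
    """One antithetic random-search update (Algorithm 1 flavor, ARS-style).

    `rollout(theta)` is assumed to return the episodic return of the linear
    policy with parameters `theta` (observation whitening, if used, lives there).
    The step size is divided by the reward standard deviation, one of the
    ARS tricks from Mania et al. (2018) cited in the setup row above.
    """
    directions = np.random.randn(n_dirs, theta.size)           # random perturbations
    r_pos = np.array([rollout(theta + delta * d) for d in directions])
    r_neg = np.array([rollout(theta - delta * d) for d in directions])
    sigma_r = np.std(np.concatenate([r_pos, r_neg])) + 1e-8    # avoid division by zero
    grad_est = ((r_pos - r_neg)[:, None] * directions).mean(axis=0)
    return theta + (alpha / sigma_r) * grad_est

# Toy usage on a quadratic "return" (higher is better); not a MuJoCo task.
rollout = lambda th: -np.sum((th - 1.0) ** 2)
theta = np.zeros(5)
for _ in range(200):
    theta = random_search_step(theta, rollout, n_dirs=5, delta=1e-1, alpha=1e-1)
```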
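
The hyperparameter sweep reported in the Experiment Setup row (δ ∈ {1e-4, 1e-3, 1e-2, 1e-1}, k ∈ {2, 5, 10, 50}, α ∈ {1e-4, 1e-3, 1e-2, 1e-1}) amounts to a plain grid search; the sketch below shows its shape. `run_experiment` is a hypothetical stand-in for a full training run averaged over random seeds, not a function from the released code.

```python
from itertools import product

def run_experiment(delta, k, alpha):
    """Hypothetical stand-in: train with one setting and return the average
    final reward over random seeds. A dummy score keeps the sketch runnable."""
    return -(abs(delta - 1e-2) + abs(alpha - 1e-2) + 0.01 * k)

grid = product([1e-4, 1e-3, 1e-2, 1e-1],   # delta: perturbation scale
               [2, 5, 10, 50],             # k: number of search directions
               [1e-4, 1e-3, 1e-2, 1e-1])   # alpha: learning rate

best_delta, best_k, best_alpha = max(grid, key=lambda cfg: run_experiment(*cfg))
```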