Generalizing Gaussian Smoothing for Random Search
Authors: Katelyn Gao, Ozan Sener
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct evaluations of the three sampling distributions on linear regression, reinforcement learning, and DFO benchmarks in order to validate our claims. Our proposals improve on GS with the same computational complexity, and are competitive with and usually outperform Guided ES (Maheswaranathan et al., 2019) and Orthogonal ES (Choromanski et al., 2018), two computationally more expensive algorithms that adapt the covariance matrix of normally distributed perturbations. |
| Researcher Affiliation | Industry | ¹Intel Labs, Santa Clara, CA, USA; ²Intel Labs, Munich, Germany. Correspondence to: Katelyn Gao <katelyn.gao@intel.com>. |
| Pseudocode | No | The paper describes algorithms and formulas in text, but it does not include formal pseudocode blocks or clearly labeled algorithm boxes. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We first sample 1000 data points from (9) to serve as the test set and initialize the parameters θ by sampling from N(0, I). There are 100 rounds. Each round consists of i) 10 optimization iterations of SGD with the gradient estimated from (4) on N newly sampled data points from (9) and L newly sampled directions ϵ_l, and ii) computation of the squared error loss over the test set. (A sketch of this loop appears below the table.) |
| Dataset Splits | No | The paper describes data generation and test set sampling for linear regression, and dynamic trajectory generation for RL, but does not specify explicit train/validation/test splits with percentages or sample counts for any fixed dataset. |
| Hardware Specification | Yes | Table 2 displays the average computation time required to sample the directions in each optimization iteration for each algorithm on Half Cheetah Rand Vel, using L Intel Xeon E7-8890 v3 CPUs; these numbers only depend on the parameter dimension d. |
| Software Dependencies | No | The paper mentions software tools like 'OpenAI Gym', the 'MuJoCo physics simulator', and 'Nevergrad', but it does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The optimizer is SGD using the gradient estimator (4) (or in (iii), its antithetic version). The learning rate and spacing c are chosen by grid search, to maximize the test performance at the end of optimization. Hyperparameters are the spacing c, chosen from {0.01, 0.1}, and the SGD learning rate η, chosen from {0.001, 0.01, 0.1}. The values chosen are the ones that minimize the test loss at the end of the 100 rounds, averaged over 3 randomly generated seeds different from those used in Figure 1. Tables 3 and 4 show the chosen hyperparameters for each algorithm and combination of L and N. (The grid-search selection rule is sketched below the table.) |
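The linear-regression protocol quoted in the "Open Datasets" row can be made concrete with a short sketch. The snippet below is an assumption-laden illustration, not the authors' released code: the data-generating process standing in for equation (9), the forward-difference form of the gradient estimator (4), and all constants (d, N, L, noise scale, seed) are placeholders chosen for readability.

```python
import numpy as np

# Illustrative sizes only; the paper sweeps several (L, N) combinations.
d, N, L = 10, 32, 8          # parameter dimension, batch size, number of sampled directions
c, lr = 0.1, 0.01            # spacing c and SGD learning rate, drawn from the quoted grids

rng = np.random.default_rng(0)
theta_star = rng.normal(size=d)      # assumed ground-truth weights for the synthetic regression task

def sample_batch(n):
    """Assumed stand-in for the paper's data-generating process (9): y = x^T theta* + noise."""
    X = rng.normal(size=(n, d))
    y = X @ theta_star + 0.1 * rng.normal(size=n)
    return X, y

def squared_error(theta, X, y):
    """Mean squared error of the linear model on a batch."""
    return np.mean((X @ theta - y) ** 2)

def gs_gradient(theta, X, y):
    """Forward-difference Gaussian-smoothing estimator (assumed form of eq. (4)):
    (1/L) * sum_l [f(theta + c*eps_l) - f(theta)] / c * eps_l, with eps_l ~ N(0, I)."""
    f0 = squared_error(theta, X, y)
    g = np.zeros_like(theta)
    for _ in range(L):
        eps = rng.normal(size=d)
        g += (squared_error(theta + c * eps, X, y) - f0) / c * eps
    return g / L

theta = rng.normal(size=d)              # initialize parameters from N(0, I)
X_test, y_test = sample_batch(1000)     # fixed test set of 1000 points, as in the quoted setup

for round_idx in range(100):            # 100 rounds
    for _ in range(10):                 # 10 SGD iterations per round on freshly sampled data
        X, y = sample_batch(N)
        theta = theta - lr * gs_gradient(theta, X, y)
    test_loss = squared_error(theta, X_test, y_test)   # squared-error loss over the test set
```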
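The hyperparameter selection quoted in the "Experiment Setup" row is a plain grid search over the spacing c and the learning rate η, averaged over a few held-out seeds. The sketch below wires such a sweep around a `run_experiment` stub; the stub, the seed values, and the wiring are assumptions made for illustration, and only the two grids are taken from the quoted text.

```python
import itertools
import numpy as np

def run_experiment(c, lr, seed):
    """Stand-in for the full training loop sketched above; it should return the test loss
    after the 100 rounds. Replaced here by a dummy value so the sweep runs on its own."""
    rng = np.random.default_rng(seed)
    return float(rng.random())

grid_c = [0.01, 0.1]              # spacing grid quoted from the paper
grid_lr = [0.001, 0.01, 0.1]      # SGD learning rate grid quoted from the paper
tuning_seeds = [100, 101, 102]    # assumed: 3 seeds distinct from those behind the reported figures

# Choose the (c, lr) pair minimizing the final test loss averaged over the tuning seeds,
# mirroring the selection rule quoted in the table.
best_c, best_lr = min(
    itertools.product(grid_c, grid_lr),
    key=lambda cfg: np.mean([run_experiment(cfg[0], cfg[1], s) for s in tuning_seeds]),
)
print("chosen spacing c:", best_c, "learning rate:", best_lr)
```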