Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies
Authors: Oscar Li, James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of unroll steps across a variety of applications, including learning dynamical systems, meta-training learned optimizers, and reinforcement learning. |
| Researcher Affiliation | Collaboration | Oscar Li¹, James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz². Machine Learning Department, School of Computer Science, Carnegie Mellon University; Google DeepMind. ¹Correspondence to: EMAIL. ²Now at OpenAI. |
| Pseudocode | Yes | Algorithm 1 Persistent Evolution Strategies [15] class PESWorker(OnlineESWorker): |
| Open Source Code | Yes | Code available at https://github.com/OscarcarLi/Noise-Reuse-Evolution-Strategies. |
| Open Datasets | Yes | We consider meta-training the learned optimizer model given in [3] (d = 1762) to optimize a 3-layer MLP on the Fashion MNIST dataset for T = 1000 steps. |
| Dataset Splits | No | The paper mentions using a 'sampled validation batch' but does not provide specific details on how this validation set is created or its size in relation to overall dataset splits, which is necessary for reproducibility. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) were provided for running experiments, only vague mentions like 'same hardware'. |
| Software Dependencies | No | The paper mentions software such as TensorFlow [34], PyTorch [35], and JAX [49] but does not provide specific version numbers for these or other ancillary software components. |
| Experiment Setup | Yes | Hence, we take extra care in tuning each method's constant learning rate and additionally allow PES to have a decay schedule. We plot the convergence of different ES gradient estimators in wall-clock time using the same hardware in Figure 5(b). (We additionally compare against automatic differentiation methods in Figure 9 in the Appendix; they all perform worse than the ES methods shown here.) |