Variance-Reduced Gradient Estimation via Noise-Reuse in Online Evolution Strategies

Authors: Oscar Li, James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we show NRES results in faster convergence than existing AD and ES methods in terms of wall-clock time and number of unroll steps across a variety of applications, including learning dynamical systems, meta-training learned optimizers, and reinforcement learning.
Researcher Affiliation | Collaboration | Oscar Li (correspondence to: oscarli@cmu.edu), James Harrison, Jascha Sohl-Dickstein, Virginia Smith, Luke Metz (now at OpenAI). Affiliations: Machine Learning Department, School of Computer Science, Carnegie Mellon University; Google DeepMind.
Pseudocode | Yes | Algorithm 1: Persistent Evolution Strategies [15], presented as class PESWorker(OnlineESWorker). (A hedged sketch of the noise-reuse estimator follows the table.)
Open Source Code | Yes | Code available at https://github.com/OscarcarLi/Noise-Reuse-Evolution-Strategies.
Open Datasets | Yes | We consider meta-training the learned optimizer model given in [3] (d = 1762) to optimize a 3-layer MLP on the Fashion MNIST dataset for T = 1000 steps.
Dataset Splits | No | The paper mentions using a 'sampled validation batch' but does not provide specific details on how this validation set is created or how large it is relative to the overall dataset splits, which is necessary for reproducibility.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided for running the experiments; the paper only vaguely refers to using the 'same hardware'.
Software Dependencies | No | The paper mentions software such as TensorFlow [34], PyTorch [35], and JAX [49] but does not provide specific version numbers for these or other ancillary software components.
Experiment Setup | Yes | Hence, we take extra care in tuning each method's constant learning rate and additionally allow PES to have a decay schedule. We plot the convergence of different ES gradient estimators in wall-clock time using the same hardware in Figure 5(b). (We additionally compare against automatic differentiation methods in Figure 9 in the Appendix; they all perform worse than the ES methods shown here.)
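
For context on the estimator referenced in the Pseudocode row, below is a minimal, hedged sketch of an antithetic online ES worker that reuses a single noise sample across every truncation window of an unroll, which is the core noise-reuse idea behind NRES. This is not the authors' implementation: the class name `NRESWorker`, the `unroll` callback, and its signature are illustrative assumptions.

```python
# Minimal sketch of a noise-reuse online ES worker (illustrative; not the
# authors' code). Assumes a hypothetical callback
#   unroll(state, params, num_steps) -> (new_state, mean_loss)
# that runs one truncation window of the inner problem.
import numpy as np


class NRESWorker:
    """One antithetic worker that reuses a single noise sample for a whole unroll."""

    def __init__(self, theta_dim, sigma, init_state, seed=0):
        self.theta_dim = theta_dim
        self.sigma = sigma
        self.init_state = np.asarray(init_state, dtype=float)
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Sample the perturbation once per episode and reuse it for every
        # truncation window. (PES instead resamples each window and accumulates
        # the noise, which the paper identifies as a source of extra variance.)
        self.eps = self.sigma * self.rng.standard_normal(self.theta_dim)
        self.state_pos = self.init_state.copy()
        self.state_neg = self.init_state.copy()

    def step(self, theta, unroll, num_steps):
        # Unroll the positively and negatively perturbed branches for one window.
        self.state_pos, loss_pos = unroll(self.state_pos, theta + self.eps, num_steps)
        self.state_neg, loss_neg = unroll(self.state_neg, theta - self.eps, num_steps)
        # Single-pair antithetic ES gradient estimate for this window.
        return (loss_pos - loss_neg) / (2.0 * self.sigma ** 2) * self.eps
```

In use, the per-window estimates from many independent workers would be averaged to form the gradient applied to the parameters, and `reset()` would be called whenever a worker completes a full T-step unroll so that fresh noise is drawn for the next episode.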