Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single

Authors: Paul Vicol

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated ES-Single on a diverse set of tasks, from synthetic problems designed to test unbiasedness, to hyperparameter optimization, RNN training, and meta-training learned optimizers. We found that ES-Single outperformed PES across all tasks we investigated.
Researcher Affiliation | Industry | Google Brain. Correspondence to: Paul Vicol <paulvicol@google.com>.
Pseudocode | Yes | Algorithm 2: ES with a single perturbation per particle, reapplied in each truncated unroll (ES-Single). (A hedged JAX sketch of this update follows the table.)
Open Source Code | Yes | We provide JAX code for ES-Single in Appendix H, and a Colab notebook implementation here.
Open Datasets | Yes | We consider a tiny LSTM trained on the character-level Penn Treebank dataset (Marcus et al., 1993). We also revisited the UCI linear regression task used in Vicol et al. (2021), which demonstrates that truncation bias can also affect regularization hyperparameters... on the UCI Yacht dataset (Asuncion & Newman, 2007). Here, we used ES-Single to meta-learn a learning rate (LR) schedule used to train an MLP on MNIST. ...tuning the learning rate and decay factor for training an MLP on Fashion MNIST... train a ResNet on CIFAR-10.
Dataset Splits | No | The paper mentions using a 'validation set' and 'sum of validation losses' as meta-objectives, for example, 'The meta-objective is the sum of validation losses over the inner problem.' However, it does not specify exact split percentages or absolute sample counts for train/validation/test datasets.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions using 'JAX code' in Appendix H but does not specify a version number for JAX or any other software dependencies used in the experiments.
Experiment Setup | Yes | For all approaches (vanilla ES, PES, and ES-Single), we use antithetic sampling, and outer optimization uses Adam with learning rate 1e-2. The total inner problem length is T = 5000, split into 500 partial unrolls of length K = 10, with N = 1000 particles and σ = 0.1 for each estimator. (A sketch wiring these settings together follows the table.)
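
The paper's Algorithm 2 and the Appendix H code define ES-Single precisely; what follows is only a minimal sketch of the core update under simplifying assumptions. The toy inner problem `unroll`, its quadratic per-step loss, and the names `unroll` and `es_single_grad` are illustrative, not taken from the paper. The sketch shows the defining property of ES-Single: each particle's antithetic perturbation is sampled once per inner problem and reapplied at every truncated unroll, whereas vanilla truncated ES resamples per unroll.

```python
import jax
import jax.numpy as jnp

def unroll(theta, s, K):
    """Run K inner steps from state `s`; a toy stand-in for the real inner problem."""
    def step(s, _):
        s = s * (1.0 - theta)              # illustrative inner dynamics
        return s, jnp.sum(s ** 2)          # illustrative per-step loss
    s, losses = jax.lax.scan(step, s, None, length=K)
    return s, jnp.sum(losses)

def es_single_grad(theta, s_pos, s_neg, eps, K, sigma):
    """Antithetic ES-Single estimate over one length-K partial unroll.

    `eps` (shape [N, D], drawn from N(0, sigma^2 I)) is sampled ONCE at the
    start of the inner problem and reapplied here at every partial unroll.
    """
    s_pos, loss_pos = jax.vmap(lambda e, s: unroll(theta + e, s, K))(eps, s_pos)
    s_neg, loss_neg = jax.vmap(lambda e, s: unroll(theta - e, s, K))(eps, s_neg)
    # Standard antithetic ES estimator, averaged over the N particles.
    ghat = jnp.mean((loss_pos - loss_neg)[:, None] * eps, axis=0) / (2 * sigma ** 2)
    return ghat, s_pos, s_neg
```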
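
And a sketch of how the stated setup (T = 5000 split into 500 partial unrolls of K = 10, N = 1000 particles, σ = 0.1, Adam with learning rate 1e-2) might be wired to the helper above. The `optax` optimizer library and the meta-parameter dimension `D` are assumptions for illustration; the toy inner problem again stands in for the paper's actual tasks.

```python
import optax  # assumed outer-optimizer library; the Adam settings are from the paper

T, K, N, sigma = 5000, 10, 1000, 0.1          # 500 partial unrolls of length K = 10
D = 4                                          # illustrative meta-parameter dimension
theta = jnp.full((D,), 0.1)

opt = optax.adam(1e-2)                         # outer optimization: Adam, lr = 1e-2
opt_state = opt.init(theta)

key = jax.random.PRNGKey(0)
eps = sigma * jax.random.normal(key, (N, D))   # one perturbation per particle, fixed for the whole inner problem
s_pos = s_neg = jnp.ones((N, D))               # perturbed inner states, carried across unrolls

for _ in range(T // K):
    ghat, s_pos, s_neg = es_single_grad(theta, s_pos, s_neg, eps, K, sigma)
    updates, opt_state = opt.update(ghat, opt_state, theta)
    theta = optax.apply_updates(theta, updates)
```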