Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Reducing Reparameterization Gradient Variance

Authors: Andrew Miller, Nick Foti, Alexander D'Amour, Ryan P. Adams

NeurIPS 2017 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our approach on a non-conjugate hierarchical model and a Bayesian neural net where our method attained orders of magnitude (20-2,000×) reduction in gradient variance resulting in faster and more stable optimization.
Researcher Affiliation | Collaboration | Andrew C. Miller, Harvard University, EMAIL; Nicholas J. Foti, University of Washington, EMAIL; Alexander D'Amour, UC Berkeley, EMAIL; Ryan P. Adams, Google Brain and Princeton University, EMAIL
Pseudocode | Yes | Algorithm 1: Gradient descent with RV-RGE with a diagonal Gaussian variational family
Open Source Code | Yes | Code is available at https://github.com/andymiller/ReducedVarianceRepGrads.
Open Datasets | Yes | Bayesian Neural Network: The non-conjugate bnn model is a Bayesian neural network applied to the wine dataset (see Appendix C.2).
Dataset Splits | No | The paper uses the terms 'train', 'validation', and 'test' in general contexts but does not provide specific data-split information (exact percentages, sample counts, or detailed methodology) for reproducing the data partitioning.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software such as Adam [13], TensorFlow [1], PyTorch [20], and Autograd [15], but it does not provide specific version numbers for these software components or libraries.
Experiment Setup | Yes | We compare the progress of the adam algorithm using various numbers of samples [13], fixing the learning rate.
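
The Pseudocode row above points to Algorithm 1 (gradient descent with RV-RGE for a diagonal Gaussian variational family). For orientation only, the sketch below implements the standard Monte Carlo reparameterization gradient for a diagonal Gaussian ELBO, i.e. the baseline estimator whose variance the paper's RV-RGE is designed to reduce; it is not the paper's algorithm. The function name reparam_gradient and the grad_log_p callable are illustrative placeholders, and NumPy is used here in place of the Autograd-based reference code linked in the Open Source Code row.

```python
# Minimal sketch (assumptions noted above) of the plain reparameterization
# gradient estimator for a diagonal Gaussian variational family
#   q(z) = N(mu, diag(sigma^2)),  sigma = exp(log_sigma),
# estimating the gradient of the ELBO  L = E_q[log p(x, z)] + H[q]
# via z = mu + sigma * eps, eps ~ N(0, I).  This is the baseline estimator,
# NOT the paper's lower-variance RV-RGE (Algorithm 1).

import numpy as np


def reparam_gradient(grad_log_p, mu, log_sigma, n_samples=10, rng=None):
    """Monte Carlo reparameterization gradient for a diagonal Gaussian.

    grad_log_p : callable returning d/dz log p(x, z) for a latent vector z
    mu, log_sigma : variational parameters (1-D arrays of equal length)
    Returns (grad_mu, grad_log_sigma) with the same shapes as the inputs.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.exp(log_sigma)

    grad_mu = np.zeros_like(mu)
    grad_log_sigma = np.zeros_like(log_sigma)
    for _ in range(n_samples):
        eps = rng.standard_normal(mu.shape)   # base randomness
        z = mu + sigma * eps                  # reparameterized sample
        g = grad_log_p(z)                     # d/dz log p(x, z)
        grad_mu += g                          # chain rule: dz/dmu = I
        grad_log_sigma += g * eps * sigma     # chain rule: dz/dlog_sigma = eps * sigma
    grad_mu /= n_samples
    grad_log_sigma /= n_samples
    # Closed-form Gaussian entropy term: dH/dlog_sigma = 1 per dimension.
    grad_log_sigma += 1.0
    return grad_mu, grad_log_sigma


if __name__ == "__main__":
    # Toy target: standard normal log density, so grad log p(z) = -z.
    grad_log_p = lambda z: -z
    mu, log_sigma = np.ones(3), np.zeros(3)
    g_mu, g_ls = reparam_gradient(grad_log_p, mu, log_sigma, n_samples=100)
    print(g_mu, g_ls)   # g_mu near -1; g_ls near 0 at the optimum sigma = 1
```

Drawing more samples (larger n_samples) is the brute-force way to lower this estimator's variance, which is what the Experiment Setup row describes comparing under a fixed learning rate; the paper's reported 20-2,000× variance reductions come from its RV-RGE construction rather than from this baseline.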