Reducing Reparameterization Gradient Variance

Authors: Andrew Miller, Nick Foti, Alexander D'Amour, Ryan P. Adams

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our approach on a non-conjugate hierarchical model and a Bayesian neural net where our method attained orders of magnitude (20-2,000×) reduction in gradient variance resulting in faster and more stable optimization.
Researcher Affiliation | Collaboration | Andrew C. Miller, Harvard University (acm@seas.harvard.edu); Nicholas J. Foti, University of Washington (nfoti@uw.edu); Alexander D'Amour, UC Berkeley (alexdamour@berkeley.edu); Ryan P. Adams, Google Brain and Princeton University (rpa@princeton.edu)
Pseudocode | Yes | Algorithm 1: Gradient descent with RV-RGE with a diagonal Gaussian variational family (a sketch of this estimator follows the table)
Open Source Code | Yes | Code is available at https://github.com/andymiller/ReducedVarianceRepGrads.
Open Datasets | Yes | Bayesian Neural Network: The non-conjugate bnn model is a Bayesian neural network applied to the wine dataset (see Appendix C.2).
Dataset Splits | No | The paper uses the terms 'train', 'validation', and 'test' in general contexts but does not provide specific split information (exact percentages, sample counts, or methodology) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper mentions software such as adam [13], TensorFlow [1], PyTorch [20], and Autograd [15], but does not provide version numbers for these components or libraries.
Experiment Setup | Yes | We compare the progress of the adam algorithm [13] using various numbers of samples, fixing the learning rate.
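For context on the Pseudocode row, the sketch below illustrates the kind of estimator Algorithm 1 operates on: a plain Monte Carlo reparameterization gradient for a diagonal Gaussian variational family, plus the generic control-variate combination that the paper's RV-RGE instantiates with a linearized approximation of the per-sample gradient. The function names, the logp_grad callback, and the NumPy implementation are illustrative assumptions, not the authors' code; their implementation is in the linked repository.

# Hypothetical sketch, not the authors' implementation: reparameterization
# gradients for a diagonal Gaussian q(z) = N(m, diag(exp(log_s)^2)), and the
# generic control-variate form that reduced-variance estimators build on.
# The closed-form Gaussian entropy term of the ELBO is assumed to be handled
# separately, as is common for this variational family.
import numpy as np

def rep_gradients(logp_grad, m, log_s, n_samples=10, rng=None):
    """Per-sample reparameterization gradient estimates of E_q[log p(x, z)]
    with respect to (m, log_s)."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.exp(log_s)
    eps = rng.standard_normal((n_samples, m.size))  # base randomness
    z = m + s * eps                                 # reparameterized samples
    g = np.array([logp_grad(zi) for zi in z])       # d log p(x, z) / dz per sample
    # Chain rule: dz/dm = I, dz/d(log_s) = eps * s
    return g, g * eps * s

def control_variate_combine(g_samples, cv_samples, cv_mean):
    """Subtract a correlated approximation of each per-sample gradient and add
    back its (analytically computed) expectation: the basic control-variate
    identity that a reduced-variance estimator relies on."""
    return (g_samples - cv_samples).mean(axis=0) + cv_mean

In the paper, the control variate comes from a linear approximation of the gradient whose expectation under the base distribution can be computed analytically; the sketch above leaves that approximation abstract.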