Fast Second Order Stochastic Backpropagation for Variational Inference
Authors: Kai Fan, Ziteng Wang, Jeff Beck, James Kwok, Katherine A. Heller
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our method on several real-world datasets and provide comparisons with other stochastic gradient methods to show substantial enhancement in convergence rates. |
| Researcher Affiliation | Academia | Kai Fan (Duke University, kai.fan@stat.duke.edu); Ziteng Wang (HKUST, wangzt2012@gmail.com); Jeffrey Beck (Duke University, jeff.beck@duke.edu); James T. Kwok (HKUST, jamesk@cse.ust.hk); Katherine Heller (Duke University, kheller@gmail.com) |
| Pseudocode | Yes | Algorithm 1 Hessian-free Algorithm on Stochastic Gaussian Variational Inference (HFSGVI); a hedged sketch of this style of update follows the table. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating the release of open-source code for the described methodology. |
| Open Datasets | Yes | We apply our algorithm to this variational logistic regression on three appropriate datasets: Duke Breast and Leukemia are small in size but high-dimensional for sparse logistic regression, and a9a, which is large. ... The datasets we used are images from the Frey Face, Olivetti Face and MNIST. |
| Dataset Splits | No | The paper mentions that hyperparameters were tuned and cross-validation was used, but it does not provide specific details on validation dataset splits (percentages or counts) that would be needed for reproduction, beyond total train/test counts for some datasets. |
| Hardware Specification | No | The paper mentions 'GPU' generally but does not specify any particular GPU model, CPU model, or detailed computer specifications used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiment. |
| Experiment Setup | Yes | The experimental setting is as follows. The initial weights are randomly drawn from N(0, 0.01^2 I) or N(0, 0.001^2 I), while all bias terms are initialized as 0. The variational lower bound only introduces the regularization on the encoder parameters, so we add an L2 regularizer on decoder parameters with a shrinkage parameter of 0.001 or 0.0001. The number of hidden nodes for the encoder and decoder is the same for all auto-encoder models, which is reasonable and convenient for constructing a symmetric structure. The number is always tuned from 200 to 800 in increments of 100. The mini-batch size is 100 for L-BFGS and Ada, while a larger mini-batch is recommended for HF, meaning it should vary according to the training size. A minimal configuration sketch also appears after the table. |
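
The table above marks pseudocode as available (Algorithm 1, HFSGVI) but notes that no source code was released. The following is only a minimal, hypothetical Python sketch of the generic Hessian-free recipe that the algorithm's name points to: approximate a Newton-style direction with conjugate gradient, using Hessian-vector products in place of an explicit Hessian. Everything here is an assumption rather than the authors' implementation: `grad_fn` stands for a mini-batch gradient of the negative variational lower bound over flattened parameters, the Hessian-vector product is a finite-difference approximation (the paper derives exact second-order terms), and the damping constant and step size are placeholders.

```python
import numpy as np

def hvp(grad_fn, theta, v, eps=1e-4):
    """Finite-difference Hessian-vector product: H v ~ (g(theta + eps*v) - g(theta)) / eps."""
    return (grad_fn(theta + eps * v) - grad_fn(theta)) / eps

def conjugate_gradient(grad_fn, theta, b, damping=1e-2, max_iters=50, tol=1e-6):
    """Approximately solve (H + damping*I) x = b using only Hessian-vector products."""
    x = np.zeros_like(b)
    r = b.copy()               # residual b - A @ x, with x = 0
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iters):
        Ap = hvp(grad_fn, theta, p) + damping * p
        alpha = rs_old / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def second_order_step(theta, grad_fn, lr=1.0):
    """One damped Newton-style step on the negative lower bound, solved matrix-free."""
    g = grad_fn(theta)                          # stochastic gradient on a mini-batch
    d = conjugate_gradient(grad_fn, theta, g)   # d ~ (H + damping*I)^{-1} g
    return theta - lr * d
```

A caller would repeatedly apply `second_order_step` over mini-batches; the paper's actual algorithm additionally exploits the Gaussian variational structure and second-order stochastic backpropagation, which this generic sketch does not model.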
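
To make the quoted experiment setup concrete, here is a small, hypothetical configuration sketch mirroring the reported choices (weights drawn from N(0, 0.01^2 I), zero biases, an L2 shrinkage of 0.001 on decoder parameters, 200 to 800 hidden nodes, mini-batch size 100). The variable names, the 784-dimensional input (the MNIST pixel count), and the decision to penalize only the decoder weight matrix are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out, sigma=0.01):
    """Weights drawn from N(0, sigma^2 I); all biases initialized to zero."""
    W = rng.normal(0.0, sigma, size=(n_in, n_out))
    b = np.zeros(n_out)
    return W, b

n_hidden = 200        # tuned from 200 to 800 in increments of 100
shrinkage = 0.001     # L2 shrinkage on decoder parameters (0.001 or 0.0001)
batch_size = 100      # mini-batch size reported for L-BFGS and Ada

enc_W, enc_b = init_layer(784, n_hidden)   # 784 = MNIST pixel count (assumed input size)
dec_W, dec_b = init_layer(n_hidden, 784)   # symmetric encoder/decoder widths

def decoder_l2(dec_W):
    """L2 regularizer applied to decoder weights only, per the reported setup."""
    return shrinkage * np.sum(dec_W ** 2)
```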