Variational Bayes on Monte Carlo Steroids

Authors: Aditya Grover, Stefano Ermon

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate empirical improvements on benchmark datasets in vision and language for sigmoid belief networks, where a neural network is used to approximate the posterior.
Researcher Affiliation | Academia | Aditya Grover, Stefano Ermon, Department of Computer Science, Stanford University, {adityag,ermon}@cs.stanford.edu
Pseudocode | Yes | Algorithm 1 VB-MCS: Learning belief networks with random projections. (See the estimator sketch after the table.)
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | We trained a generative model for silhouette images of 28 × 28 dimensions from the Caltech 101 Silhouettes dataset [3]. The dataset consists of 4,100 train images, 2,264 validation images and 2,307 test images. ... We performed the second set of experiments on the latest version of the NIPS Proceedings dataset [4], which consists of the distribution of words in all papers that appeared in NIPS from 1988-2003. We performed an 80/10/10 split of the dataset into 1,986 train, 249 validation, and 248 test documents. (Footnote 3: Available at https://people.cs.umass.edu/~marlin/data.shtml. Footnote 4: Available at http://ai.stanford.edu/~gal/)
Dataset Splits | Yes | The dataset consists of 4,100 train images, 2,264 validation images and 2,307 test images. ... We performed an 80/10/10 split of the dataset into 1,986 train, 249 validation, and 248 test documents. (See the loading sketch after the table.)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies | No | The paper mentions "The optimizer used is Adam [16]", but it does not list any specific software libraries, frameworks, or programming language versions (e.g., Python, PyTorch, TensorFlow, CUDA) with version numbers.
Experiment Setup | Yes | The learning rate was fixed based on validation performance to 3 × 10⁻⁴ for the generator network and reduced by a factor of 5 for the inference network. Mini-batch size was fixed to 20. Regularization was imposed by early stopping of training after 50 epochs. The optimizer used is Adam [16]. For k-SBN, we show results for three values of k: 5, 10, and 20, and the aggregation is done using the median estimator with T = 3. (See the configuration sketch after the table.)
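
The pseudocode row refers to Algorithm 1 (VB-MCS), which the paper builds from multi-sample variational bounds and random projections. As a rough illustration of only the aggregation quoted in the experiment setup (a k-sample importance-weighted estimate, repeated T = 3 times and combined with a median), here is a minimal NumPy sketch. The function names and toy inputs are hypothetical, and the random-projection step of VB-MCS is deliberately omitted, so this is not the paper's full algorithm.

```python
import numpy as np

def k_sample_estimate(log_weights):
    """IWAE-style k-sample bound on log p(x): log((1/k) * sum_i w_i),
    computed stably in log space. Each entry of `log_weights` stands for
    log p(x, z_i) - log q(z_i | x) with z_i ~ q(z | x)."""
    k = len(log_weights)
    m = np.max(log_weights)
    return m + np.log(np.sum(np.exp(log_weights - m))) - np.log(k)

def median_of_estimates(batches):
    """Aggregate independent k-sample estimates with a median,
    mirroring the quoted setup (median estimator with T = 3)."""
    return float(np.median([k_sample_estimate(b) for b in batches]))

# Toy usage: T = 3 independent batches of k = 5 log importance weights.
rng = np.random.default_rng(0)
batches = [rng.normal(loc=-90.0, scale=2.0, size=5) for _ in range(3)]
print(median_of_estimates(batches))
```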
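
The datasets row quotes download locations but no loading code. Below is a minimal sketch of how the two datasets might be read and split; the .mat file name and the keys train_data, val_data, and test_data are assumptions about the Caltech 101 Silhouettes split file, and the NIPS word-count matrix is a zero placeholder so the 80/10/10 split logic stays self-contained.

```python
import numpy as np
from scipy.io import loadmat

# Caltech 101 Silhouettes (28 x 28). File and key names are assumptions about
# the split file hosted at https://people.cs.umass.edu/~marlin/data.shtml.
mat = loadmat("caltech101_silhouettes_28_split1.mat")
train, val, test = mat["train_data"], mat["val_data"], mat["test_data"]
print(train.shape, val.shape, test.shape)  # expected: 4,100 / 2,264 / 2,307 rows

# NIPS Proceedings: an 80/10/10 document split. `docs` is a zero placeholder
# (2,483 = 1,986 + 249 + 248 documents; the vocabulary size is made up); the
# real data comes from http://ai.stanford.edu/~gal/.
docs = np.zeros((2483, 500))
rng = np.random.default_rng(0)
perm = rng.permutation(len(docs))
n_train = int(0.80 * len(docs))  # 1986
n_val = int(0.10 * len(docs))    # 248; the paper reports 249/248 for
                                 # validation/test, so its rounding differs by one
splits = (docs[perm[:n_train]],
          docs[perm[n_train:n_train + n_val]],
          docs[perm[n_train + n_val:]])
print([len(s) for s in splits])  # [1986, 248, 249]
```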
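
Finally, the experiment-setup row pins down the optimizer hyperparameters. A minimal PyTorch sketch of just that configuration follows; the two Sequential modules are placeholders, not the paper's sigmoid belief network architecture.

```python
from torch import nn, optim

# Placeholder networks; the paper's sigmoid belief network is not reproduced here.
generator = nn.Sequential(nn.Linear(200, 784), nn.Sigmoid())
inference = nn.Sequential(nn.Linear(784, 200), nn.Sigmoid())

base_lr = 3e-4  # fixed via validation performance, per the quoted setup
gen_opt = optim.Adam(generator.parameters(), lr=base_lr)
inf_opt = optim.Adam(inference.parameters(), lr=base_lr / 5)  # reduced by a factor of 5

batch_size = 20  # mini-batch size from the quoted setup
max_epochs = 50  # early stopping of training after 50 epochs
```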