Training Bayesian Neural Networks with Sparse Subspace Variational Inference
Authors: Junbo Li, Zichen Miao, Qiang Qiu, Ruqi Zhang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments show that SSVI sets new benchmarks in crafting sparse BNNs, achieving, for instance, a 10-20× compression in model size with under 3% performance drop, and up to 20× FLOPs reduction during training compared with dense VI training. We conduct experiments using CIFAR-10 and CIFAR-100 datasets (Krizhevsky, 2009) and ResNet-18 (He et al., 2016) as the backbone networks. |
| Researcher Affiliation | Academia | Junbo Li¹, Zichen Miao², Qiang Qiu², Ruqi Zhang¹; ¹Department of Computer Science, ²School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA; {ljunbo,miaoz,qqiu,ruqiz}@purdue.edu |
| Pseudocode | Yes | Algorithm 1 Sparse Subspace Variational Inference (SSVI) *(a hedged sketch of such an alternating sparse-subspace loop is given below the table)* |
| Open Source Code | Yes | We release the code at https://github.com/ljb121002/SSVI. |
| Open Datasets | Yes | We conduct experiments using CIFAR-10 and CIFAR-100 datasets (Krizhevsky, 2009) and ResNet-18 (He et al., 2016) as the backbone networks. |
| Dataset Splits | No | The paper mentions using the CIFAR-10 and CIFAR-100 datasets but does not provide training/validation/test split percentages or sample counts, nor does it explicitly state that the standard splits for these datasets are used. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments; it only mentions "ResNet-18 as the backbone network", which is a model, not hardware. |
| Software Dependencies | No | The paper does not specify version numbers for any software components, programming languages, or libraries used in the experiments (e.g., Python version, PyTorch/TensorFlow version). |
| Experiment Setup | Yes | We train the model for 200 epochs, with a batch size of 128. For the optimizer, we use Adam with a learning rate of 0.001. The initial value for KL warm-up β is set to 0.1, and it increases linearly to 1.0 during the first 50 epochs. The initial standard deviation for the BNN's weights is set to 0.01. For the weight removal rate, we set it to 0.05, and for the weight addition rate, we set it to 0.05. The inner update steps M is set to 5. *(These settings are collected in the configuration sketch below the table.)* |
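The Pseudocode row refers to Algorithm 1 in the paper, which we do not reproduce here. The snippet below is only a minimal sketch of the alternating structure such a sparse-subspace VI loop can take: M inner variational updates inside the current subspace, followed by a subspace update that removes and re-adds a fixed fraction of weights. The toy Bayesian linear-regression problem, the standard-normal prior, the signal-to-noise-ratio removal criterion, the random re-activation rule, and all dimensions and learning rates are illustrative assumptions, not the paper's actual method or criteria; only the removal/addition rates (0.05) and M = 5 come from the Experiment Setup row.

```python
# Illustrative sketch of an alternating sparse-subspace VI loop on a toy
# Bayesian linear-regression problem. The removal criterion (|mu|/sigma) and
# random re-activation below are simplifying assumptions for illustration.
import torch

torch.manual_seed(0)

d, n = 50, 200                           # candidate weights, data points
X = torch.randn(n, d)
w_true = torch.zeros(d)
w_true[:5] = 2.0                         # sparse ground-truth weights
y = X @ w_true + 0.1 * torch.randn(n)

# Mean-field Gaussian variational parameters over all candidate weights.
mu = torch.zeros(d, requires_grad=True)
rho = torch.full((d,), -3.0, requires_grad=True)   # sigma = softplus(rho)
opt = torch.optim.Adam([mu, rho], lr=1e-2)

# Sparse subspace: boolean mask of currently active weights.
active = torch.zeros(d, dtype=torch.bool)
active[torch.randperm(d)[:10]] = True

removal_rate, addition_rate, M, beta = 0.05, 0.05, 5, 1.0

def neg_elbo():
    """Negative ELBO with a reparameterized sample restricted to the active subspace."""
    sigma = torch.nn.functional.softplus(rho)
    w = (mu + sigma * torch.randn(d)) * active.float()
    nll = 0.5 * ((y - X @ w) ** 2).sum()
    # KL between N(mu, sigma^2) and a standard-normal prior, active weights only.
    kl = (0.5 * (sigma ** 2 + mu ** 2 - 1.0) - torch.log(sigma))[active].sum()
    return nll + beta * kl

for outer in range(200):
    # Inner loop: M variational updates within the current subspace.
    for _ in range(M):
        opt.zero_grad()
        neg_elbo().backward()
        opt.step()

    # Subspace update: drop low-SNR active weights, re-activate the same number.
    with torch.no_grad():
        sigma = torch.nn.functional.softplus(rho)
        snr = mu.abs() / sigma
        k = max(1, int(removal_rate * int(active.sum())))
        active_idx = active.nonzero().flatten()
        drop = active_idx[snr[active_idx].argsort()[:k]]
        active[drop] = False
        inactive_idx = (~active).nonzero().flatten()
        grow = inactive_idx[torch.randperm(len(inactive_idx))[:k]]
        active[grow] = True

print("active weights:", sorted(active.nonzero().flatten().tolist()))
```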
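The Experiment Setup row lists concrete hyperparameters. The sketch below simply collects them into a configuration dict and implements the stated linear KL warm-up (β from 0.1 to 1.0 over the first 50 epochs). The dict keys, the function signature, and the per-epoch interpolation are assumptions for illustration; they are not taken from the released code at https://github.com/ljb121002/SSVI.

```python
# Hyperparameters quoted in the Experiment Setup row, gathered into one place.
# Key names are illustrative, not the released code's configuration keys.
config = {
    "epochs": 200,
    "batch_size": 128,
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "kl_beta_init": 0.1,
    "kl_beta_final": 1.0,
    "kl_warmup_epochs": 50,
    "init_weight_std": 0.01,
    "weight_removal_rate": 0.05,
    "weight_addition_rate": 0.05,
    "inner_update_steps_M": 5,
}

def kl_beta(epoch: int, cfg: dict = config) -> float:
    """KL weight beta: linear increase from 0.1 to 1.0 over the first 50 epochs, then constant."""
    if epoch >= cfg["kl_warmup_epochs"]:
        return cfg["kl_beta_final"]
    frac = epoch / cfg["kl_warmup_epochs"]
    return cfg["kl_beta_init"] + frac * (cfg["kl_beta_final"] - cfg["kl_beta_init"])

if __name__ == "__main__":
    for e in (0, 25, 50, 100):
        print(e, round(kl_beta(e), 3))   # 0.1, 0.55, 1.0, 1.0
```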