Structured Dropout Variational Inference for Bayesian Neural Networks

Authors: Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we conduct extensive experiments on standard benchmarks to demonstrate the effectiveness of VSD over state-of-the-art variational methods on predictive accuracy, uncertainty estimation, and out-of-distribution detection.
Researcher Affiliation | Collaboration | Son Nguyen,1 Duong Nguyen,3 Khai Nguyen,1 Khoat Than,3,1 Hung Bui,1 Nhat Ho2 (1 VinAI Research, Vietnam; 2 University of Texas at Austin; 3 Hanoi University of Science and Technology)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include any explicit statement or link indicating the availability of source code for the described methodology.
Open Datasets | Yes | Finally, we carry out extensive experiments with standard datasets and different network architectures to validate the effectiveness of our method on many criteria, including scalability, predictive accuracy, uncertainty calibration, and out-of-distribution detection, in comparison to popular variational inference methods. (Section 1, Contributions, point 5) We now compare the predictive performance of the aforementioned methods for classification tasks on three standard image datasets: MNIST [45], CIFAR10 [43], and SVHN [64]. (Section 4.1) We reproduce the experiments proposed in [80] (ELRG), in which we trained AlexNet and ResNet18 on 4 datasets CIFAR10, SVHN, CIFAR100 [43], and STL10 [11] to evaluate the predictive performance of our proposal compared to other methods. (Section 4.2)
Dataset Splits | Yes | Details about data descriptions, network architectures, and hyper-parameter tuning are presented in Appendix I.4. The paper follows common practice for splitting the datasets, typically a train/validation/test split or a train/test split with early stopping on a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for its experiments; it only notes in general terms that the method can be "employed for deep CNNs".
Software Dependencies | No | The paper mentions using the "Adam optimizer [37]" but does not list software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | All models are trained with Adam optimizer [37] with initial learning rate 1e-3, and step decay at epochs 100, 150 (for MNIST, CIFAR10, SVHN) and 30, 60, 90 (for CIFAR100, STL10) by a factor of 0.1. (Appendix I.4)
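The quoted schedule (Adam, initial learning rate 1e-3, decayed by a factor of 0.1 at fixed milestone epochs) can be sketched as a plain function. The milestone epochs and factors come from the quote above; the function name and structure are illustrative assumptions, not the authors' code.

```python
def stepped_lr(epoch, base_lr=1e-3, milestones=(100, 150), gamma=0.1):
    """Learning rate under step decay: multiplied by `gamma` once per milestone reached.

    Defaults follow the MNIST/CIFAR10/SVHN schedule quoted from Appendix I.4;
    pass milestones=(30, 60, 90) for the CIFAR100/STL10 schedule.
    The function name and interface are assumptions for illustration.
    """
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

# MNIST/CIFAR10/SVHN: 1e-3 before epoch 100, 1e-4 from epoch 100, 1e-5 from epoch 150.
print(stepped_lr(0), stepped_lr(120), stepped_lr(160))
```

In a PyTorch setup this would typically correspond to wrapping the Adam optimizer in `torch.optim.lr_scheduler.MultiStepLR` with the same milestones and `gamma=0.1`.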