Structured Dropout Variational Inference for Bayesian Neural Networks
Authors: Son Nguyen, Duong Nguyen, Khai Nguyen, Khoat Than, Hung Bui, Nhat Ho
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct extensive experiments on standard benchmarks to demonstrate the effectiveness of VSD over state-of-the-art variational methods on predictive accuracy, uncertainty estimation, and out-of-distribution detection. |
| Researcher Affiliation | Collaboration | Son Nguyen1, Duong Nguyen3, Khai Nguyen1, Khoat Than3,1, Hung Bui1, Nhat Ho2 — 1 VinAI Research, Vietnam; 2 University of Texas at Austin; 3 Hanoi University of Science and Technology |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement or link indicating the availability of source code for the described methodology. |
| Open Datasets | Yes | Finally, we carry out extensive experiments with standard datasets and different network architectures to validate the effectiveness of our method on many criteria, including scalability, predictive accuracy, uncertainty calibration, and out-of-distribution detection, in comparison to popular variational inference methods. (Section 1 - Contributions, point 5) We now compare the predictive performance of the aforementioned methods for classification tasks on three standard image datasets: MNIST [45], CIFAR10 [43], and SVHN [64]. (Section 4.1) We reproduce the experiments proposed in [80] (ELRG), in which we trained AlexNet and ResNet18 on 4 datasets CIFAR10, SVHN, CIFAR100 [43], and STL10 [11] to evaluate the predictive performance of our proposal compared to other methods. (Section 4.2) |
| Dataset Splits | Yes | Details about data descriptions, network architectures, and hyper-parameter tuning are presented in Appendix I.4. The paper follows common practice for splitting these benchmark datasets, i.e., a train/validation/test split or a train/test split with early stopping on a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It only generally mentions capabilities like being able to be "employed for deep CNNs". |
| Software Dependencies | No | The paper mentions using "Adam optimizer [37]" but does not specify software names with version numbers for libraries or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | All models are trained with Adam optimizer [37] with initial learning rate 1e-3, and step decay at epochs 100, 150 (for MNIST, CIFAR10, SVHN) and 30, 60, 90 (for CIFAR100, STL10) by a factor of 0.1. (Appendix I.4) |
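The training schedule quoted in the last row (initial learning rate 1e-3, decayed by a factor of 0.1 at fixed epochs) can be sketched as a small helper. This is an illustrative reconstruction of the schedule described in Appendix I.4, not code from the paper; the function name and defaults are assumptions, and the behavior matches what PyTorch's `MultiStepLR` scheduler would produce with `milestones=[100, 150]` and `gamma=0.1`.

```python
def step_decay_lr(epoch, base_lr=1e-3, milestones=(100, 150), gamma=0.1):
    """Learning rate at a given epoch under a step-decay schedule.

    The rate starts at base_lr and is multiplied by gamma each time
    a milestone epoch is reached (as in the paper's MNIST/CIFAR10/SVHN
    setup; for CIFAR100/STL10 the milestones would be (30, 60, 90)).
    """
    drops = sum(epoch >= m for m in milestones)  # milestones already passed
    return base_lr * gamma ** drops


# Example: lr is 1e-3 before epoch 100, 1e-4 until epoch 150, then 1e-5.
for epoch in (0, 99, 100, 149, 150):
    print(epoch, step_decay_lr(epoch))
```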