Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
Authors: Jincheng Bai, Qifan Song, Guang Cheng
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Notably, our empirical results demonstrate that this variational procedure provides uncertainty quantification in terms of Bayesian predictive distribution and is also capable to accomplish consistent variable selection by training a sparse multi-layer neural network. |
| Researcher Affiliation | Academia | Jincheng Bai, Department of Statistics, Purdue University, West Lafayette, IN 47906, bai45@purdue.edu; Qifan Song, Department of Statistics, Purdue University, West Lafayette, IN 47906, qfsong@purdue.edu; Guang Cheng, Department of Statistics, Purdue University, West Lafayette, IN 47906, chengg@purdue.edu |
| Pseudocode | Yes | Algorithm 1 Variational inference for sparse BNN with normal slab distribution. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We evaluate the empirical performance of the proposed variational inference through simulation study and MNIST data application. |
| Dataset Splits | No | The paper does not explicitly provide details about a validation dataset split with percentages or sample counts. It mentions 'validation' in the context of 'validity of Theorem 4.1' and 'training/testing RMSE' but not as a separate data split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' as a software package but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For all the numerical studies, we let σ0² = 2, the choice of λ follows Theorem 4.1 (denoted by λ_opt): log(λ_opt⁻¹) = log(T) − 0.1 r(L−1) log N − log(√(n s)). The remaining details of implementation (such as initialization, choices of K, m and learning rate) are provided in the supplementary material. ... The sparsity levels specified for AGP are 30% and 5%, and for LOT are 10% and 5%, respectively for the two cases. ... using the same batch size |
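
The hyperparameter rule quoted in the Experiment Setup row can be expressed as a one-line computation. The sketch below is a minimal illustration of that formula only; the symbol interpretations (T as the total number of network weights, L the depth, N the layer width, n the sample size, s the sparsity level, r a rate constant) and all numeric values are assumptions for illustration, not details confirmed by the excerpt above.

```python
import math

def lambda_opt(T, L, N, n, s, r=1.0):
    """Prior inclusion probability from the quoted rule:
    log(lambda_opt^{-1}) = log(T) - 0.1*r*(L-1)*log(N) - log(sqrt(n*s)).
    Symbol meanings (T: total weights, L: depth, N: width, n: sample size,
    s: sparsity, r: rate constant) are assumed, not taken from the paper.
    """
    log_inv_lambda = (
        math.log(T)
        - 0.1 * r * (L - 1) * math.log(N)
        - math.log(math.sqrt(n * s))
    )
    return math.exp(-log_inv_lambda)

# Toy example with hypothetical network and data sizes.
print(f"lambda_opt = {lambda_opt(T=10_000, L=2, N=100, n=1_000, s=50):.4f}")
```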