Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee

Authors: Jincheng Bai, Qifan Song, Guang Cheng

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Notably, our empirical results demonstrate that this variational procedure provides uncertainty quantification in terms of Bayesian predictive distribution and is also capable to accomplish consistent variable selection by training a sparse multi-layer neural network."
Researcher Affiliation | Academia | Jincheng Bai, Department of Statistics, Purdue University, West Lafayette, IN 47906, bai45@purdue.edu; Qifan Song, Department of Statistics, Purdue University, West Lafayette, IN 47906, qfsong@purdue.edu; Guang Cheng, Department of Statistics, Purdue University, West Lafayette, IN 47906, chengg@purdue.edu
Pseudocode | Yes | "Algorithm 1: Variational inference for sparse BNN with normal slab distribution." (a hypothetical sketch follows this table)
Open Source Code | No | The paper does not provide an explicit statement of, or link to, open-source code for the described methodology.
Open Datasets | Yes | "We evaluate the empirical performance of the proposed variational inference through simulation study and MNIST data application."
Dataset Splits | No | The paper does not explicitly describe a validation split with percentages or sample counts; "validation" appears only in connection with the validity of Theorem 4.1 and with training/testing RMSE, not as a separate data split.
Hardware Specification | No | The paper does not report the hardware (e.g., CPU/GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions PyTorch as a software package but does not give version numbers for any software dependency.
Experiment Setup | Yes | "For all the numerical studies, we let σ₀² = 2, the choice of λ follows Theorem 4.1 (denoted by λ_opt): log(1/λ_opt) = log(T) − 0.1 r (L−1) log N − log(√(n·s)). The remaining details of implementation (such as initialization, choices of K, m and learning rate) are provided in the supplementary material. ... The sparsity levels specified for AGP are 30% and 5%, and for LOT are 10% and 5%, respectively for the two cases. ... using the same batch size." (a small helper evaluating the λ_opt rule follows this table)
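
For readers of the Pseudocode row: the paper's Algorithm 1 performs variational inference for a sparse BNN with a normal slab. Below is a minimal, hypothetical PyTorch sketch of one such layer, assuming a Bernoulli(φ) × N(μ, σ²) spike-and-slab variational family optimized by a stochastic-gradient ELBO with a binary-concrete (Gumbel-softmax style) relaxation of the inclusion indicators. The class name `SpikeSlabLinear`, the hyperparameter values, and the 1/n KL scaling are illustrative and are not taken from the authors' implementation (no code was released).

```python
# Hypothetical sketch (not the authors' code): one spike-and-slab variational layer
# with a N(mu, sigma^2) slab and a Bernoulli(phi) inclusion indicator per weight,
# relaxed with a binary-concrete sample so the ELBO admits stochastic gradients.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpikeSlabLinear(nn.Module):
    def __init__(self, d_in, d_out, sigma0=2.0 ** 0.5, lam=1e-3, temperature=0.5):
        super().__init__()
        # Variational parameters: slab mean/scale and inclusion logit per weight.
        self.mu = nn.Parameter(0.1 * torch.randn(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))
        self.logit_phi = nn.Parameter(torch.zeros(d_out, d_in))
        self.sigma0, self.lam, self.temperature = sigma0, lam, temperature

    def forward(self, x):
        sigma = self.log_sigma.exp()
        # Binary-concrete relaxation of the Bernoulli inclusion variables.
        u = torch.rand_like(sigma).clamp(1e-6, 1 - 1e-6)
        gamma = torch.sigmoid(
            (self.logit_phi + torch.log(u) - torch.log(1 - u)) / self.temperature
        )
        # Reparameterized slab weights, masked by the relaxed indicators.
        w = gamma * (self.mu + sigma * torch.randn_like(sigma))
        return F.linear(x, w)

    def kl(self):
        sigma = self.log_sigma.exp()
        phi = torch.sigmoid(self.logit_phi)
        # KL(q || prior) with slab prior N(0, sigma0^2) and prior inclusion prob lam.
        kl_slab = phi * (torch.log(self.sigma0 / sigma)
                         + (sigma ** 2 + self.mu ** 2) / (2 * self.sigma0 ** 2) - 0.5)
        kl_bern = (phi * torch.log(phi / self.lam + 1e-10)
                   + (1 - phi) * torch.log((1 - phi) / (1 - self.lam) + 1e-10))
        return (kl_slab + kl_bern).sum()

# One illustrative negative-ELBO step on a toy regression batch.
layer = SpikeSlabLinear(d_in=20, d_out=1)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
x, y = torch.randn(64, 20), torch.randn(64, 1)
neg_elbo = F.mse_loss(layer(x), y, reduction="sum") + layer.kl() / 1000.0  # 1/n scaling is illustrative
opt.zero_grad()
neg_elbo.backward()
opt.step()
```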
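The Experiment Setup row quotes the paper's rule for choosing λ. As a reading aid, here is a small, hypothetical helper that evaluates that expression exactly as quoted; the interpretation of T, r, L, N, n and s is my reading of the paper and should be checked against Theorem 4.1 and the supplementary material.

```python
# Hypothetical helper (not from the paper) that evaluates the quoted lambda_opt rule
# literally. Assumed symbol meanings: T = total number of network weights, L = depth,
# N = layer width, n = sample size, s = sparsity level, r = a constant from Theorem 4.1.
import math

def lambda_opt(T, r, L, N, n, s):
    log_inv_lambda = math.log(T) - 0.1 * r * (L - 1) * math.log(N) - math.log(math.sqrt(n * s))
    return math.exp(-log_inv_lambda)

# Illustrative call with made-up sizes, not values from the experiments.
print(lambda_opt(T=10_000, r=1.0, L=3, N=100, n=2_000, s=50))
```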