Joint Inference for Neural Network Depth and Dropout Regularization
Authors: Kishan K C, Rui Li, MohammadMahdi Gilany
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across domains show that by adapting network depth and dropout regularization to data, our method achieves superior performance compared to state-of-the-art methods with well-calibrated uncertainty estimates. |
| Researcher Affiliation | Academia | Kishan K C (Rochester Institute of Technology), Rui Li (Rochester Institute of Technology), Mahdi Gilany (Queen's University) |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Appendix |
| Open Datasets | Yes | We evaluate all methods on UCI datasets [37] using standard splits [38]...MNIST [41], Fashion MNIST [42], SVHN [43], and CIFAR10 [44]. |
| Dataset Splits | Yes | We incrementally generate 20, 500, and 2000 data points from a periodic function [14]:...with 20% for validation. *(See the data-generation sketch after this table.)* |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix. While the paper states this information is in the Appendix, the Appendix itself is not part of the provided document, so specific hardware details are not present in the main paper. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) were explicitly mentioned in the paper's main text. |
| Experiment Setup | Yes | We set the maximum number of neurons per layer M = 20, and use leaky ReLU activation functions in the input and hidden layers with batch normalization to retain stability. We simulate 5 samples per minibatch to approximate the ELBO in (10), and evaluate convergence by computing the cross-correlations of the sample values over 3000 epochs. *(See the configuration sketch after this table.)* |
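
The Dataset Splits row above describes the synthetic benchmark: 20, 500, and 2000 points drawn incrementally from a periodic function, with 20% held out for validation. A minimal sketch of that protocol follows. The specific periodic function (the paper's reference [14]) and the noise level are not quoted in this summary, so the sine curve and the `scale=0.1` noise below are hypothetical stand-ins, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def periodic_fn(x):
    # Hypothetical periodic target; NOT the function from the paper's reference [14].
    return np.sin(2 * np.pi * x)

def make_split(n, val_frac=0.2):
    """Draw n noisy samples of the periodic function and hold out 20% for validation."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = periodic_fn(x) + rng.normal(scale=0.1, size=n)  # noise scale is assumed
    n_val = int(round(val_frac * n))
    idx = rng.permutation(n)
    val, train = idx[:n_val], idx[n_val:]
    return (x[train], y[train]), (x[val], y[val])

# The paper's incremental sizes: 20, 500, and 2000 points.
splits = {n: make_split(n) for n in (20, 500, 2000)}
```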
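
The Experiment Setup row translates into a concrete layer configuration. The sketch below assumes PyTorch (the paper names no framework, per the Software Dependencies row) and a fixed depth with a placeholder dropout rate; the paper's actual contribution, jointly inferring depth and dropout, is not implemented here. The sketch only illustrates the stated constants: at most M = 20 neurons per layer, leaky ReLU with batch normalization, and a 5-sample Monte Carlo estimate of the minibatch objective.

```python
import torch
import torch.nn as nn

M = 20  # maximum number of neurons per layer, as stated in the paper

def hidden_block(in_features, p_drop=0.1):
    # linear -> batch norm -> leaky ReLU, as in the Experiment Setup row.
    # The dropout rate is a placeholder: the paper learns it, this value is assumed.
    return nn.Sequential(
        nn.Linear(in_features, M),
        nn.BatchNorm1d(M),
        nn.LeakyReLU(),
        nn.Dropout(p=p_drop),
    )

class FixedDepthNet(nn.Module):
    """Fixed-depth stand-in; the paper infers depth rather than fixing n_hidden."""
    def __init__(self, in_dim, out_dim, n_hidden=2):
        super().__init__()
        layers = [hidden_block(in_dim)]
        layers += [hidden_block(M) for _ in range(n_hidden - 1)]
        self.body = nn.Sequential(*layers)
        self.head = nn.Linear(M, out_dim)

    def forward(self, x):
        return self.head(self.body(x))

def mc_objective(model, loss_fn, x, y, n_samples=5):
    # Average the loss over 5 stochastic forward passes, mirroring
    # "We simulate 5 samples per minibatch to approximate the ELBO".
    return torch.stack([loss_fn(model(x), y) for _ in range(n_samples)]).mean()
```

Averaging several forward passes is only meaningful because dropout keeps the network stochastic at training time; with dropout removed, `mc_objective` collapses to a single deterministic evaluation.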