Scale Mixtures of Neural Network Gaussian Processes
Authors: Hyungi Lee, Eunggu Yun, Hongseok Yang, Juho Lee
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show the usefulness of our construction on various real-world regression and classification tasks. We demonstrate that, despite the increased flexibility, the scale mixture of NNGPs is readily applicable to most of the problems where NNGPs are used, without increasing the difficulty of inference. Moreover, the heavy-tailed processes derived from our construction are shown to be more robust than NNGPs for out-of-distribution or corrupted data while maintaining similar performance for the normal data. Our empirical analysis suggests that our construction is not merely a theoretical extension of the existing framework, but also provides a practical alternative to NNGPs. |
| Researcher Affiliation | Collaboration | Hyungi Lee (1), Eunggu Yun (1), Hongseok Yang (1,2,3), Juho Lee (1,4); (1) Kim Jaechul Graduate School of AI, KAIST, South Korea; (2) School of Computing, KAIST, South Korea; (3) Discrete Mathematics Group, Institute for Basic Science (IBS), Daejeon, South Korea; (4) AITRICS, Seoul, South Korea |
| Pseudocode | No | The paper describes inference algorithms (Stochastic Variational Gaussian Process, Stochastic Variational Student t Process, Importance Sampling) in textual form within Appendices C, D, and E, but does not provide them as structured pseudocode or algorithm blocks with formal labels. |
| Open Source Code | Yes | The experiment code is available at GitHub: https://github.com/Anonymous0109/Scale-Mixtures-of-NNGP |
| Open Datasets | Yes | We tested the scale mixtures of NNGPs with inverse gamma prior on eight datasets collected from UCI repositories (https://archive.ics.uci.edu/ml/datasets.php) and measured the negative log-likelihood values on the test set. |
| Dataset Splits | Yes | For the regression experiments, we divide each dataset into train/validation/test sets with the ratio 0.8/0.1/0.1. |
| Hardware Specification | Yes | We used a server with Intel Xeon Silver 4214R CPU and 128GB of RAM to evaluate the classification with Gaussian likelihood experiment, and used NVIDIA GeForce RTX 2080Ti GPU to conduct other experiments. |
| Software Dependencies | No | "Our implementation used Neural Tangents library (Novak et al., 2020) and JAX (Bradbury et al., 2018)." (Section 4) and "For the classification experiments, we divide each dataset into train/test sets as provided by TensorFlow Datasets (https://www.tensorflow.org/datasets), and further divide train sets into train/validation sets with the ratio 0.9/0.1." (Section G). Neither passage mentions specific version numbers. (A minimal usage sketch with these libraries follows the table.) |
| Experiment Setup | Yes | In Fig. 1, we used the inverse gamma prior with hyperparameter setting α = 2 and β = 2 for the experiments validating convergence of initial distribution (Theorem 3.1) and last layer training (Theorem 3.2). For the full layer training (Theorem 3.3), we used α = 1 and β = 1. For the regression experiments, we divide each dataset into train/validation/test sets with the ratio 0.8/0.1/0.1. ... We choose the best hyperparameters based on validation NLL values... (A minimal split sketch follows the table.) |
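
The Dataset Splits and Experiment Setup rows quote a 0.8/0.1/0.1 train/validation/test division for the regression datasets. The sketch below only illustrates such a split under assumed details: the shuffling, the seed, and the `split_indices` helper are not from the paper.

```python
# Illustrative 0.8/0.1/0.1 train/validation/test split (assumed shuffling and seed).
import numpy as np

def split_indices(n, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle indices and cut them into train/validation/test sets by the given ratios."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

Under the quoted setup, hyperparameters would then be chosen by negative log-likelihood on the validation portion (`val_idx`), with the test portion held out.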
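The Software Dependencies row names the Neural Tangents library and JAX without version numbers. The following is a minimal sketch of standard NNGP posterior inference with those libraries; the network depth and width, the toy data, and the `diag_reg` value are assumptions, and it does not implement the paper's scale-mixture construction.

```python
# Minimal sketch of NNGP regression with Neural Tangents + JAX (not the authors' code).
from jax import random
import neural_tangents as nt
from neural_tangents import stax

# Infinitely wide 3-layer MLP; kernel_fn computes the corresponding NNGP/NTK kernels.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

key_x, key_y, key_t = random.split(random.PRNGKey(0), 3)
x_train = random.normal(key_x, (20, 8))  # toy regression data (assumed shapes)
y_train = random.normal(key_y, (20, 1))
x_test = random.normal(key_t, (5, 8))

# Closed-form GP posterior under the NNGP kernel.
predict_fn = nt.predict.gradient_descent_mse_ensemble(
    kernel_fn, x_train, y_train, diag_reg=1e-4)
mean, cov = predict_fn(x_test=x_test, get='nngp', compute_cov=True)
print(mean.shape, cov.shape)  # (5, 1) (5, 5)
```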