Scale Mixtures of Neural Network Gaussian Processes

Authors: Hyungi Lee, Eunggu Yun, Hongseok Yang, Juho Lee

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically show the usefulness of our construction on various real-world regression and classification tasks. We demonstrate that, despite the increased flexibility, the scale mixture of NNGPs is readily applicable to most of the problems where NNGPs are used, without increasing the difficulty of inference. Moreover, the heavy-tailed processes derived from our construction are shown to be more robust than NNGPs for out-of-distribution or corrupted data while maintaining similar performance for the normal data. Our empirical analysis suggests that our construction is not merely a theoretical extension of the existing framework, but also provides a practical alternative to NNGPs.
Researcher Affiliation | Collaboration | Hyungi Lee (1), Eunggu Yun (1), Hongseok Yang (1,2,3), Juho Lee (1,4); (1) Kim Jaechul Graduate School of AI, KAIST, South Korea; (2) School of Computing, KAIST, South Korea; (3) Discrete Mathematics Group, Institute for Basic Science (IBS), Daejeon, South Korea; (4) AITRICS, Seoul, South Korea
Pseudocode | No | The paper describes inference algorithms (Stochastic Variational Gaussian Process, Stochastic Variational Student-t Process, Importance Sampling) in textual form within Appendices C, D, and E, but does not provide them as structured pseudocode or algorithm blocks with formal labels.
Open Source Code | Yes | The experiment code is available at GitHub: https://github.com/Anonymous0109/Scale-Mixtures-of-NNGP
Open Datasets | Yes | We tested the scale mixtures of NNGPs with inverse gamma prior on eight datasets collected from UCI repositories (https://archive.ics.uci.edu/ml/datasets.php) and measured the negative log-likelihood values on the test set.
Dataset Splits | Yes | For the regression experiments, we divide each dataset into train/validation/test sets with the ratio 0.8/0.1/0.1. (A split sketch follows the table below.)
Hardware Specification | Yes | We used a server with Intel Xeon Silver 4214R CPU and 128GB of RAM to evaluate the classification with Gaussian likelihood experiment, and used NVIDIA GeForce RTX 2080Ti GPU to conduct other experiments.
Software Dependencies | No | "Our implementation used Neural Tangents library (Novak et al., 2020) and JAX (Bradbury et al., 2018)." (Section 4) and "For the classification experiments, we divide each dataset into train/test sets as provided by TensorFlow Datasets (https://www.tensorflow.org/datasets), and further divide train sets into train/validation sets with the ratio 0.9/0.1." (Section G). Neither quote mentions specific version numbers. (See the Neural Tangents/JAX sketch after this table.)
Experiment Setup | Yes | In Fig. 1, we used the inverse gamma prior with hyperparameter setting α = 2 and β = 2 for the experiments validating convergence of initial distribution (Theorem 3.1) and last layer training (Theorem 3.2). For the full layer training (Theorem 3.3), we used α = 1 and β = 1. For the regression experiments, we divide each dataset into train/validation/test sets with the ratio 0.8/0.1/0.1. ... We choose the best hyperparameters based on validation NLL values...
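
The Dataset Splits row quotes a 0.8/0.1/0.1 regression split. Below is a minimal sketch of such a split, assuming NumPy arrays; the function name, random seed, and shuffling scheme are illustrative and not taken from the paper's repository.

```python
import numpy as np

def split_dataset(X, y, seed=0):
    """Shuffle and split a regression dataset 80/10/10 into train/val/test,
    mirroring the ratio quoted above. Hypothetical helper, not the authors' code."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    n_train, n_val = int(0.8 * len(X)), int(0.1 * len(X))
    idx_train = perm[:n_train]
    idx_val = perm[n_train:n_train + n_val]
    idx_test = perm[n_train + n_val:]
    return ((X[idx_train], y[idx_train]),
            (X[idx_val], y[idx_val]),
            (X[idx_test], y[idx_test]))
```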
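The Software Dependencies row names Neural Tangents and JAX without versions. The following is a minimal sketch of how an NNGP kernel is typically obtained with those libraries, not a reproduction of the authors' code; the architecture, widths, and input shapes are placeholders.

```python
import jax
import jax.numpy as jnp
from neural_tangents import stax

# A simple fully connected architecture; kernel_fn returns its NNGP kernel in closed form.
init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x_train = jax.random.normal(jax.random.PRNGKey(0), (20, 8))  # 20 points, 8 features
x_test = jax.random.normal(jax.random.PRNGKey(1), (5, 8))

k_train_train = kernel_fn(x_train, x_train, 'nngp')  # (20, 20) NNGP Gram matrix
k_test_train = kernel_fn(x_test, x_train, 'nngp')    # (5, 20) cross-covariance
```

The 'nngp' argument asks kernel_fn for the NNGP covariance rather than the neural tangent kernel.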
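The Experiment Setup row quotes an inverse gamma prior with α = β = 2 on the scale. Below is a minimal sketch, in JAX, of the construction in its simplest form: draw σ² ~ InvGamma(α, β) and then sample f | σ² ~ N(0, σ²K) over a fixed NNGP Gram matrix. The helper name, jitter, and dummy kernel are illustrative and not from the paper's code.

```python
import jax
import jax.numpy as jnp

def sample_scale_mixture_prior(key, kernel, alpha=2.0, beta=2.0, num_samples=5):
    """Draw prior function values from a scale mixture of a GP:
    sigma^2 ~ InvGamma(alpha, beta), f | sigma^2 ~ N(0, sigma^2 * kernel).
    Hypothetical helper; alpha = beta = 2 matches the quoted Fig. 1 setting."""
    n = kernel.shape[0]
    chol = jnp.linalg.cholesky(kernel + 1e-6 * jnp.eye(n))  # jitter for stability
    key_scale, key_noise = jax.random.split(key)
    # An InvGamma(alpha, beta) draw is the reciprocal of a Gamma(alpha, rate=beta) draw.
    gamma_draws = jax.random.gamma(key_scale, alpha, shape=(num_samples,)) / beta
    sigma2 = 1.0 / gamma_draws
    eps = jax.random.normal(key_noise, shape=(num_samples, n))
    return jnp.sqrt(sigma2)[:, None] * (eps @ chol.T)  # shape (num_samples, n)

key = jax.random.PRNGKey(0)
dummy_kernel = jnp.eye(10)  # stand-in for an NNGP Gram matrix, e.g. k_train_train above
samples = sample_scale_mixture_prior(key, dummy_kernel)  # shape (5, 10)
```

Marginalizing σ² under an inverse gamma mixing distribution yields a heavy-tailed (Student-t) process, which is the robustness property the paper evaluates.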