Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance
Authors: Neal Jean, Sang Michael Xie, Stefano Ermon
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply SSDKL to a variety of real-world regression tasks in the inductive semi-supervised learning setting, beginning with eight datasets from the UCI repository [23]. We also explore the challenging task of predicting local poverty measures from high-resolution satellite imagery [24]. In our reported results, we use the squared exponential or radial basis function kernel. *(An RBF-kernel sketch appears after this table.)* |
| Researcher Affiliation | Academia | Neal Jean, Sang Michael Xie, Stefano Ermon, Department of Computer Science, Stanford University, Stanford, CA 94305, {nealjean, xie, ermon}@cs.stanford.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data for reproducing experimental results can be found on GitHub: https://github.com/ermongroup/ssdkl |
| Open Datasets | Yes | We apply SSDKL to a variety of real-world regression tasks in the inductive semi-supervised learning setting, beginning with eight datasets from the UCI repository [23]. |
| Dataset Splits | Yes | For each dataset, we train on n = {50, 100, 200, 300, 400, 500} labeled examples, retain 1000 examples as the held-out test set, and treat the remaining data as unlabeled examples. Following [29], the labeled data is randomly split 90-10 into training and validation samples, giving a realistically small validation set. *(See the split sketch after this table.)* |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | Our SSDKL model is implemented in TensorFlow [25]. No version number for TensorFlow is specified. |
| Experiment Setup | Yes | All kernel hyperparameters are optimized directly through L_semisup, and we use the validation set for early stopping to prevent overfitting and for selecting α ∈ {0.1, 1, 10}. ... we choose a neural network with a similar [d-100-50-50-2] architecture and two-dimensional embedding. ... We use learning rates of 1 × 10⁻³ and 0.1 for the neural network and GP parameters respectively and initialize all GP parameters to 1. *(Hedged code sketches of the semi-supervised objective and this setup follow the table.)* |
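
The "minimizing predictive variance" in the title and the L_semisup term quoted in the Experiment Setup row refer to the paper's semi-supervised objective: the GP negative log marginal likelihood on the labeled data plus α times the mean posterior predictive variance on the unlabeled data, computed with an RBF kernel over neural-network embeddings. The NumPy sketch below is only meant to make that loss explicit; the function and variable names (`rbf_kernel`, `ssdkl_objective`, `Z_l`, `Z_u`) are illustrative and not taken from the authors' repository.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared exponential (RBF) kernel between embedding matrices A (n, d) and B (m, d)."""
    sq_dist = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale**2)

def ssdkl_objective(Z_l, y_l, Z_u, alpha=1.0, noise_var=1.0, lengthscale=1.0, signal_var=1.0):
    """L_semisup = negative log marginal likelihood on labeled embeddings (Z_l, y_l)
    plus alpha times the mean GP predictive variance on unlabeled embeddings Z_u."""
    n = len(y_l)
    K = rbf_kernel(Z_l, Z_l, lengthscale, signal_var) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L.T, np.linalg.solve(L, y_l))
    nll = 0.5 * y_l @ a + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2.0 * np.pi)
    # Posterior predictive variance at each unlabeled point: k(x*, x*) - k*^T K^{-1} k*.
    K_star = rbf_kernel(Z_l, Z_u, lengthscale, signal_var)   # (n, m)
    v = np.linalg.solve(L, K_star)                            # (n, m)
    var_u = signal_var - np.sum(v**2, axis=0)                 # diagonal of posterior covariance
    return nll + alpha * var_u.mean()
```

In the released implementation these quantities are built in TensorFlow so that gradients flow back through the embeddings into the network weights; the NumPy version above is a reference for the math only.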
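
The split protocol quoted in the Dataset Splits row (n labeled training points, 1000 held-out test points, the rest unlabeled, and a 90-10 train/validation split of the labeled set) can be written as a short helper. This is a hedged sketch; `make_splits` and its defaults are illustrative, not from the authors' code.

```python
import numpy as np

def make_splits(X, y, n_labeled=100, n_test=1000, seed=0):
    """Split a regression dataset following the quoted protocol: n_labeled labeled
    examples, n_test held-out test examples, and the rest treated as unlabeled.
    The labeled set is further split 90-10 into training and validation subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    test_idx = idx[:n_test]
    labeled_idx = idx[n_test:n_test + n_labeled]
    unlabeled_idx = idx[n_test + n_labeled:]
    n_train = int(round(0.9 * n_labeled))
    train_idx, val_idx = labeled_idx[:n_train], labeled_idx[n_train:]
    return ((X[train_idx], y[train_idx]),   # labeled training set
            (X[val_idx], y[val_idx]),       # small validation set
            (X[test_idx], y[test_idx]),     # held-out test set
            X[unlabeled_idx])               # unlabeled pool (features only)
```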
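
The Experiment Setup row also pins down the architecture and optimizer settings. The TensorFlow/Keras sketch below wires those numbers together; it is an assumption-laden reconstruction rather than the authors' code (the excerpt does not state the activation function or the optimizer, so ReLU and Adam are placeholders, and the GP parameter names are invented here).

```python
import tensorflow as tf

def make_embedding_net(d):
    """[d-100-50-50-2] embedding network with a two-dimensional output fed to the GP kernel.
    ReLU activations are an assumption; the excerpt only gives the layer sizes."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(d,)),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(50, activation="relu"),
        tf.keras.layers.Dense(50, activation="relu"),
        tf.keras.layers.Dense(2),
    ])

# GP hyperparameters initialized to 1, as stated in the quoted setup.
gp_params = {
    "lengthscale": tf.Variable(1.0),
    "signal_variance": tf.Variable(1.0),
    "noise_variance": tf.Variable(1.0),
}

# Separate optimizers mirroring the quoted learning rates:
# 1e-3 for the network weights and 0.1 for the GP parameters (Adam is an assumption).
net_optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
gp_optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

# alpha is chosen from this grid on the validation set, with early stopping.
alpha_grid = [0.1, 1.0, 10.0]
```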