Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance

Authors: Neal Jean, Sang Michael Xie, Stefano Ermon

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply SSDKL to a variety of real-world regression tasks in the inductive semi-supervised learning setting, beginning with eight datasets from the UCI repository [23]. We also explore the challenging task of predicting local poverty measures from high-resolution satellite imagery [24]. In our reported results, we use the squared exponential or radial basis function kernel. (A sketch of this kernel appears after the table.)
Researcher Affiliation | Academia | Neal Jean, Sang Michael Xie, Stefano Ermon; Department of Computer Science, Stanford University, Stanford, CA 94305; {nealjean, xie, ermon}@cs.stanford.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data for reproducing experimental results can be found on GitHub: https://github.com/ermongroup/ssdkl
Open Datasets | Yes | We apply SSDKL to a variety of real-world regression tasks in the inductive semi-supervised learning setting, beginning with eight datasets from the UCI repository [23].
Dataset Splits | Yes | For each dataset, we train on n = {50, 100, 200, 300, 400, 500} labeled examples, retain 1000 examples as the held-out test set, and treat the remaining data as unlabeled examples. Following [29], the labeled data is randomly split 90-10 into training and validation samples, giving a realistically small validation set. (A sketch of this split protocol appears after the table.)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., CPU, GPU, or cloud instance types) used for running the experiments.
Software Dependencies | No | Our SSDKL model is implemented in TensorFlow [25]. No TensorFlow version number is specified.
Experiment Setup | Yes | All kernel hyperparameters are optimized directly through L_semisup, and we use the validation set for early stopping to prevent overfitting and for selecting α ∈ {0.1, 1, 10}. ... we choose a neural network with a similar [d-100-50-50-2] architecture and two-dimensional embedding. ... We use learning rates of 1 × 10^-3 and 0.1 for the neural network and GP parameters respectively and initialize all GP parameters to 1. (A sketch of this objective appears after the table.)
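
For reference, the squared exponential (RBF) kernel mentioned in the Research Type row has the standard form k(x, x') = σ² exp(-||x - x'||² / (2ℓ²)). A minimal NumPy sketch; the hyperparameter names `lengthscale` and `signal_var` are illustrative, not taken from the paper:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, signal_var=1.0):
    """Squared exponential (RBF) kernel matrix between two point sets.

    X1: (n, d) array, X2: (m, d) array. Returns the (n, m) Gram matrix.
    `lengthscale` and `signal_var` are illustrative hyperparameter names.
    """
    # Pairwise squared Euclidean distances via the expansion
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (
        np.sum(X1**2, axis=1)[:, None]
        + np.sum(X2**2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    return signal_var * np.exp(-0.5 * sq_dists / lengthscale**2)
```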
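The Dataset Splits row describes a concrete protocol: n labeled examples, a 1000-example held-out test set, the remainder treated as unlabeled, and a 90-10 train/validation split of the labeled pool. A sketch of that protocol for a generic (X, y) array pair; the function name `make_splits` is mine, not the authors':

```python
import numpy as np

def make_splits(X, y, n_labeled, n_test=1000, seed=0):
    """Split a dataset as described in the paper: n_labeled labeled
    examples (further split 90-10 into train/validation), a 1000-example
    held-out test set, and the remainder treated as unlabeled."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))

    test_idx = idx[:n_test]
    labeled_idx = idx[n_test:n_test + n_labeled]
    unlabeled_idx = idx[n_test + n_labeled:]

    # A 90-10 split of the labeled pool gives a realistically small
    # validation set, as the quoted text notes.
    n_train = int(0.9 * n_labeled)
    train_idx, val_idx = labeled_idx[:n_train], labeled_idx[n_train:]

    return {
        "train": (X[train_idx], y[train_idx]),
        "val": (X[val_idx], y[val_idx]),
        "test": (X[test_idx], y[test_idx]),
        "unlabeled": X[unlabeled_idx],  # labels discarded
    }
```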
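Finally, the Experiment Setup row refers to the semi-supervised objective L_semisup. A hedged sketch of one plausible form of this objective, consistent with the paper's title ("minimizing predictive variance"): the GP marginal negative log likelihood on labeled data plus α times the mean posterior variance at unlabeled points. It reuses `rbf_kernel` from the first sketch; SSDKL's neural network feature extractor is omitted, and `noise_var` is an illustrative hyperparameter name:

```python
import numpy as np

def semisup_objective(X_lab, y_lab, X_unl, alpha=1.0,
                      lengthscale=1.0, signal_var=1.0, noise_var=1.0):
    """Supervised GP negative log likelihood on labeled data plus
    alpha * mean posterior predictive variance at unlabeled points.
    A sketch only: the deep feature extractor of SSDKL is omitted."""
    n = len(X_lab)
    K = rbf_kernel(X_lab, X_lab, lengthscale, signal_var) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)

    # Supervised term: GP marginal negative log likelihood,
    # 0.5 y^T K^{-1} y + 0.5 log|K| + 0.5 n log(2 pi).
    beta = np.linalg.solve(L.T, np.linalg.solve(L, y_lab))
    nll = (0.5 * y_lab @ beta
           + np.log(np.diag(L)).sum()
           + 0.5 * n * np.log(2 * np.pi))

    # Unsupervised term: posterior variance at each unlabeled point,
    # var(x*) = k(x*, x*) - k(x*, X) K^{-1} k(X, x*).
    K_star = rbf_kernel(X_unl, X_lab, lengthscale, signal_var)  # (m, n)
    v = np.linalg.solve(L, K_star.T)                            # (n, m)
    var = signal_var - np.sum(v**2, axis=0)                     # (m,)

    return nll + alpha * var.mean()
```

Per the quoted setup, an objective of this kind would be minimized by gradient descent over both the neural network and GP parameters, with learning rates of 1 × 10^-3 and 0.1 respectively, selecting α ∈ {0.1, 1, 10} on the validation set with early stopping.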