Stochastic Variational Deep Kernel Learning

Authors: Andrew G. Wilson, Zhiting Hu, Russ R. Salakhutdinov, Eric P. Xing

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show improved performance over stand-alone deep networks, SVMs, and state-of-the-art scalable Gaussian processes on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet." "We evaluate our proposed approach, stochastic variational deep kernel learning (SV-DKL), on a wide range of classification problems, including an airline delay task with over 5.9 million data points (section 5.1), a large and diverse collection of classification problems from the UCI repository (section 5.2), and image classification benchmarks (section 5.3)." (Section 5, Experiments)
Researcher Affiliation | Academia | Andrew Gordon Wilson* (Cornell University), Zhiting Hu* (CMU), Ruslan Salakhutdinov (CMU), Eric P. Xing (CMU)
Pseudocode | No | The paper describes its procedures in prose but does not include structured pseudocode or an algorithm block. (An illustrative sketch of the deep kernel construction it builds on appears below the table.)
Open Source Code | Yes | "We achieve good predictive accuracy and scalability over a wide range of classification tasks, while retaining a straightforward, general purpose, and highly practical probabilistic non-parametric representation, with code available at https://people.orie.cornell.edu/andrew/code."
Open Datasets | Yes | "...on several classification benchmarks, including an airline delay dataset containing 6 million training points, CIFAR, and ImageNet." "We first consider a large airline dataset consisting of flight arrival and departure details for all commercial flights within the US in 2008."
Dataset Splits | No | "Following Hensman et al. [11], we selected a hold-out set of 100,000 points uniformly at random, and the results of DNN and SV-DKL are averaged over 5 runs ± one standard deviation."
Hardware Specification | Yes | "All experiments were performed on a Linux machine with eight 4.0GHz CPU cores, one Tesla K40c GPU, and 32GB RAM."
Software Dependencies | No | The paper mentions implementing deep neural networks with Caffe, but no version number for Caffe or any other software dependency is provided.
Experiment Setup | Yes | "We initialized A to be an identity matrix, and optimized it in the joint learning procedure to recover cross-dimension correlations from data." "We first train a deep neural network using SGD with the softmax loss objective, and rectified linear activation functions." "We achieve good performance setting the number of samples T = 1 in Eq. 4 for expectation estimation in variational inference." "The SV-DKL joint training was conducted using a large minibatch size of 50,000 to reduce the variance of the stochastic gradient." "We used a minibatch size of 5,000 for stochastic gradient training of SV-DKL." (A skeleton of this two-stage schedule appears below the table.)
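
Since the paper provides no algorithm block (see the Pseudocode row above), here is a minimal runnable Python/NumPy sketch of the deep kernel construction that SV-DKL builds on: a base RBF kernel applied to the outputs of a neural-network feature extractor, k_deep(x, x') = k_rbf(g_w(x), g_w(x')). This is an orientation aid under stated assumptions, not the authors' implementation; the two-layer ReLU network, its dimensions, and the helper names (make_feature_extractor, rbf_kernel, deep_kernel) are illustrative.

    import numpy as np

    def relu(z):
        return np.maximum(z, 0.0)

    def make_feature_extractor(rng, d_in, d_hidden, d_out):
        # Illustrative two-layer ReLU network g_w; in the paper the weights
        # are learned, here they are random stand-ins.
        W1 = rng.standard_normal((d_in, d_hidden)) / np.sqrt(d_in)
        W2 = rng.standard_normal((d_hidden, d_out)) / np.sqrt(d_hidden)
        return lambda X: relu(relu(X @ W1) @ W2)

    def rbf_kernel(Z1, Z2, lengthscale=1.0, variance=1.0):
        # Base kernel k(z, z') = s^2 * exp(-||z - z'||^2 / (2 * l^2)).
        sq_dists = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
        return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

    def deep_kernel(X1, X2, g):
        # Deep kernel: apply the base kernel to learned features g(x).
        return rbf_kernel(g(X1), g(X2))

    rng = np.random.default_rng(0)
    g = make_feature_extractor(rng, d_in=8, d_hidden=32, d_out=2)
    X = rng.standard_normal((5, 8))
    K = deep_kernel(X, X, g)  # 5x5 symmetric PSD Gram matrix
    print(K.shape)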
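
The Experiment Setup row describes a two-stage schedule: pre-train the DNN with SGD on the softmax (cross-entropy) loss with ReLU activations, then jointly optimize the network together with the mixing matrix A (initialized to the identity) on minibatches. The PyTorch skeleton below sketches that schedule only; the data, network sizes, and the placeholder svdkl_variational_loss are assumptions, since the paper's actual objective is the stochastic variational bound of its Eq. 4, implemented by the authors with Caffe.

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Illustrative stand-ins; sizes and data are assumptions, not the paper's.
    net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 10))
    X, y = torch.randn(20000, 8), torch.randint(0, 10, (20000,))
    loader = DataLoader(TensorDataset(X, y), batch_size=5000, shuffle=True)
    ce = nn.CrossEntropyLoss()

    # Stage 1: pre-train the DNN alone with SGD on the softmax loss.
    opt = torch.optim.SGD(net.parameters(), lr=0.1)
    for xb, yb in loader:
        opt.zero_grad()
        ce(net(xb), yb).backward()
        opt.step()

    # Mixing matrix A over the (here, stand-in) GP outputs, initialized
    # to the identity as in the quoted setup.
    A = nn.Parameter(torch.eye(10))

    def svdkl_variational_loss(outputs, yb):
        # Hypothetical placeholder: mixes outputs through A and applies the
        # softmax loss. The paper instead optimizes a stochastic variational
        # bound, estimated with T = 1 samples (its Eq. 4).
        return ce(outputs @ A, yb)

    # Stage 2: joint minibatch training of the network and A.
    joint_opt = torch.optim.SGD(list(net.parameters()) + [A], lr=0.01)
    for xb, yb in loader:
        joint_opt.zero_grad()
        svdkl_variational_loss(net(xb), yb).backward()
        joint_opt.step()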