Differentiable Compositional Kernel Learning for Gaussian Processes
Authors: Shengyang Sun, Guodong Zhang, Chaoqi Wang, Wenyuan Zeng, Jiaman Li, Roger Grosse
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted a series of experiments to measure the NKN's predictive ability in several settings: time series, regression benchmarks, and texture images. We focused in particular on extrapolation, since this is a strong test of whether it has uncovered the underlying structure. Furthermore, we tested the NKN on Bayesian Optimization, where model structure and calibrated uncertainty can each enable more efficient exploration. Code is available at git@github.com:ssydasheng/Neural-Kernel-Network.git |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, University of Toronto, Toronto, ON, CA. 2Vector Institute. 3Uber Advanced Technologies Group, Toronto, ON, CA. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at git@github.com:ssydasheng/Neural-Kernel-Network.git |
| Open Datasets | Yes | UCI collection (Asuncion & Newman, 2007) |
| Dataset Splits | Yes | To select the mixture number of SM kernels, we further subdivided the training set into a training and validation set, using the same PCA-splitting method as described above. *(One possible reading of this split is sketched below the table.)* |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For all experiments, the NKN uses 6 primitive kernels including 2 RQ, 2 RBF, and 2 LIN. The following layers are organized as Linear8-Product4-Linear4-Product2-Linear1. We trained both the variance and d-dimensional lengthscales for all kernels. *(A hedged code sketch of this architecture follows the table.)* |
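The quoted "PCA-splitting method" is not spelled out in the excerpt above. As one plausible reading (an assumption on our part, not the paper's confirmed procedure), the sketch below sorts points by their projection onto the first principal component and holds out the extreme fraction as a test set, which would match the paper's emphasis on extrapolation. The function name `pca_split` and the `test_fraction` parameter are hypothetical.

```python
import numpy as np

def pca_split(X, y, test_fraction=0.1):
    """Hypothetical reading of the paper's 'PCA-splitting':
    sort points by their projection onto the first principal
    component and hold out the extreme fraction, so the test
    set lies at one end of the data (an extrapolation split)."""
    Xc = X - X.mean(axis=0)
    # First principal component via SVD of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[0]
    order = np.argsort(proj)
    n_test = max(1, int(len(X) * test_fraction))
    train_idx, test_idx = order[:-n_test], order[-n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```

The same helper could then be reapplied to the resulting training set to carve out the validation set used for selecting the number of SM mixture components, as the quoted response describes.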
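The Experiment Setup row fully specifies the NKN architecture, so a small self-contained example can illustrate how such a network evaluates a composed kernel. The following is a minimal NumPy sketch, not the authors' implementation (see their repository for that): `nkn_gram` and the parameter layout are hypothetical, the Linear layers are assumed to use nonnegative weights (nonnegative sums of PSD kernels stay PSD), and the Product layers are assumed to multiply adjacent pairs of kernels element-wise (products of PSD kernels stay PSD). The lengthscale `ls` may be a scalar or a d-dimensional vector, matching the quoted "d-dimensional lengthscales".

```python
import numpy as np

def rbf(X1, X2, ls, var):
    d2 = (((X1[:, None, :] - X2[None, :, :]) / ls) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2)

def rq(X1, X2, ls, var, alpha=1.0):
    d2 = (((X1[:, None, :] - X2[None, :, :]) / ls) ** 2).sum(-1)
    return var * (1.0 + d2 / (2.0 * alpha)) ** (-alpha)

def lin(X1, X2, var):
    return var * (X1 @ X2.T)

def nkn_gram(X, params):
    """Evaluate a Linear8-Product4-Linear4-Product2-Linear1 stack
    on the Gram matrices of the 6 primitive kernels (2 RQ, 2 RBF,
    2 LIN), returning a single composed (n, n) Gram matrix."""
    # Primitive layer: stack the 6 Gram matrices as (6, n, n).
    K = np.stack([rq(X, X, *p) for p in params["rq"]]
                 + [rbf(X, X, *p) for p in params["rbf"]]
                 + [lin(X, X, p) for p in params["lin"]])
    # Each Linear layer mixes kernels with nonnegative weights;
    # each Product layer multiplies adjacent pairs element-wise,
    # halving the kernel count: 6 -> 8 -> 4 -> 4 -> 2 -> 1.
    for W in params["linear"]:           # shapes (8,6), (4,4), (1,2)
        K = np.einsum("oi,inm->onm", W, K)
        if K.shape[0] > 1:               # no product after the final Linear1
            K = K[0::2] * K[1::2]
    return K[0]

# Usage with hypothetical parameter values:
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
params = {
    "rq":  [(1.0, 1.0), (2.0, 0.5)],    # (lengthscale, variance) per kernel
    "rbf": [(1.0, 1.0), (0.5, 0.5)],
    "lin": [0.1, 0.2],                  # variance per LIN kernel
    "linear": [rng.uniform(size=(8, 6)),  # nonnegative mixing weights
               rng.uniform(size=(4, 4)),
               rng.uniform(size=(1, 2))],
}
K = nkn_gram(X, params)                 # (5, 5) PSD Gram matrix
```

In the paper's actual method these parameters are all trained end to end by differentiating the GP marginal likelihood; the sketch only shows the forward composition that makes such a network a valid kernel.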