Preconditioning Kernel Matrices
Authors: Kurt Cutajar, Michael Osborne, John Cunningham, Maurizio Filippone
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate datasets over a range of problem size and dimensionality. Because PCG is exact in the limit of iterations (unlike approximate techniques), we demonstrate a tradeoff between accuracy and computational effort that improves beyond state-of-the-art approximation and factorization approaches. In this section, we provide an empirical exploration of these preconditioners in a practical setting. We begin by considering three datasets for regression from the UCI repository (Asuncion & Newman, 2007), namely the Concrete dataset (n = 1030, d = 8), the Power Plant dataset (n = 9568, d = 4), and the Protein dataset (n = 45730, d = 9). |
| Researcher Affiliation | Academia | Kurt Cutajar KURT.CUTAJAR@EURECOM.FR EURECOM, Department of Data Science Michael A. Osborne MOSB@ROBOTS.OX.AC.UK University of Oxford, Department of Engineering Science John P. Cunningham JPC2181@COLUMBIA.EDU Columbia University, Department of Statistics Maurizio Filippone MAURIZIO.FILIPPONE@EURECOM.FR EURECOM, Department of Data Science |
| Pseudocode | Yes | Algorithm 1 The Preconditioned CG Algorithm, adapted from (Golub & Van Loan, 1996) Require: data X, vector v, convergence threshold ϵ, initial vector x0, maximum no. of iterations T |
| Open Source Code | Yes | Code to replicate all results in this paper is available at http://github.com/mauriziofilippone/preconditioned_GPs |
| Open Datasets | Yes | We begin by considering three datasets for regression from the UCI repository (Asuncion & Newman, 2007), namely the Concrete dataset (n = 1030, d = 8), the Power Plant dataset (n = 9568, d = 4), and the Protein dataset (n = 45730, d = 9). GP classification: Spam dataset (n = 4601, d = 57) and EEG dataset (n = 14979, d = 14). |
| Dataset Splits | Yes | All methods are initialized from the same set of kernel parameters, and the curves are averaged over 5 folds (3 for the Protein and EEG datasets). |
| Hardware Specification | Yes | For the sake of integrity, we ran each method in the comparison individually on a workstation with Intel Xeon E5-2630 CPU having 16 cores and 128GB RAM. |
| Software Dependencies | No | The paper states that "The CG, PCG and CHOL approaches have been implemented in R;" but does not specify the R version or any library/package versions critical for reproducibility. GPstuff is mentioned as a comparison target, likewise without a version. |
| Experiment Setup | Yes | The convergence threshold is set to ϵ² = n · 10⁻¹⁰ so as to roughly accept an average error of 10⁻⁵ on each element of the solution. We focus on an isotropic RBF variant of the kernel in eq. 1, fixing the marginal variance σ² to one. We vary the lengthscale parameter l and the noise variance λ in log10 scale. We set the stepsize to one. All methods are initialized from the same set of kernel parameters. |
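To make the Pseudocode and Experiment Setup rows concrete, below is a minimal sketch of the preconditioned CG loop in the style of Algorithm 1 (Golub & Van Loan, 1996), applied to a kernel system (K + λI)x = v with the paper's stopping rule ϵ² = n · 10⁻¹⁰. The Jacobi (diagonal) preconditioner used here is a hypothetical stand-in for illustration only; the paper studies more sophisticated preconditioners.

```python
import numpy as np

def pcg(A, v, M_inv, x0=None, eps2=1e-10, T=1000):
    """Preconditioned conjugate gradient for A x = v.

    M_inv(r) applies the inverse preconditioner to a vector.
    Iterates until the squared residual norm drops below eps2
    or T iterations are reached.
    """
    n = len(v)
    x = np.zeros(n) if x0 is None else x0.copy()
    r = v - A @ x        # initial residual
    z = M_inv(r)         # preconditioned residual
    p = z.copy()         # initial search direction
    rz = r @ z
    for _ in range(T):
        if r @ r < eps2:              # convergence check on ||r||^2
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)         # exact line search step
        x += alpha * p
        r -= alpha * Ap
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # update search direction
        rz = rz_new
    return x

# Toy problem: isotropic RBF kernel plus noise on random 1-D inputs
rng = np.random.default_rng(0)
n = 50
X = rng.standard_normal((n, 1))
K = np.exp(-0.5 * (X - X.T) ** 2) + 1e-2 * np.eye(n)  # K + lambda*I
v = rng.standard_normal(n)

# Jacobi preconditioner (illustrative stand-in), threshold eps2 = n * 1e-10
d = np.diag(K)
x = pcg(K, v, M_inv=lambda r: r / d, eps2=n * 1e-10)
```

Setting eps2 = n · 10⁻¹⁰ means the root-mean-square residual per element is roughly 10⁻⁵, matching the tolerance quoted in the Experiment Setup row.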