Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification
Authors: Dimitrios Milios, Raffaello Camoriano, Pietro Michiardi, Lorenzo Rosasco, Maurizio Filippone
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results show that the proposed approach provides essentially the same accuracy and uncertainty quantification as Gaussian process classification while requiring only a fraction of computational resources. Section 5 (Experiments): We experimentally evaluate the methodologies discussed on the datasets outlined in Table 1. |
| Researcher Affiliation | Academia | Dimitrios Milios, EURECOM, Sophia Antipolis, France (dimitrios.milios@eurecom.fr); Raffaello Camoriano, LCSL, IIT (Italy) & MIT (USA) (raffaello.camoriano@iit.it); Pietro Michiardi, EURECOM, Sophia Antipolis, France (pietro.michiardi@eurecom.fr); Lorenzo Rosasco, DIBRIS, Università degli Studi di Genova, Italy, and LCSL, IIT (Italy) & MIT (USA) (lrosasco@mit.edu); Maurizio Filippone, EURECOM, Sophia Antipolis, France (maurizio.filippone@eurecom.fr) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/dmilios/dirichletGPC. |
| Open Datasets | Yes | Table 1: Datasets used for evaluation, available from the UCI repository [1]. Columns: Dataset, Classes, Training instances, Test instances, Dimensionality, Inducing points (row values omitted in the excerpt). [1] A. Asuncion and D. J. Newman. UCI Machine Learning Repository, 2007. |
| Dataset Splits | Yes | For GPR we further split each training dataset: 80% is used to train the model and the remaining 20% is used for calibration with Platt scaling. NKRR likewise uses an 80/20% split for k-fold cross-validation and Platt scaling calibration, respectively. (A hedged sketch of this calibration step follows the table.) |
| Hardware Specification | Yes | R. C. and L. R. gratefully acknowledge the support of NVIDIA Corporation for the donation of the Titan Xp GPUs and the Tesla k40 GPU used for this research. |
| Software Dependencies | No | For the implementation of GP-based models, we use and extend the algorithms available in the GPflow library [18]. (Only the library name is given; no version numbers are specified for GPflow or TensorFlow.) |
| Experiment Setup | Yes | In all experiments, we consider an isotropic RBF kernel; the kernel hyper-parameters are selected by maximizing the marginal likelihood for the GP-based approaches, and by k-fold cross-validation for NKRR (with k = 10 for all datasets except for SUSY, for which k = 5). In the case of GPD, kernel parameters are shared across classes, so they are informed by all classes. In the case of GPR, we also optimize the noise variance jointly with all kernel parameters. For each of the datasets, the α_ε parameter of GPD was selected according to the training MNLL: 0.1 for COVERBIN; 0.001 for LETTER, DRIVE, and MOCAP; and 0.01 for the remaining datasets. (Hedged sketches of the GPD label transformation and per-class GP fitting follow the table.) |
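
To make the setup rows concrete, here is a minimal sketch of the Dirichlet label transformation at the heart of GPD. The moment-matching follows the paper: each label contributes a Dirichlet count of α_ε + 1 for the observed class and α_ε for the others, and the log of the implied Gamma marginals is approximated by a Gaussian. The function name and signature are ours, not taken from the authors' repository.

```python
import numpy as np

def dirichlet_regression_targets(y, num_classes, alpha_eps=0.01):
    """Turn integer class labels into per-class Gaussian regression targets.

    Hypothetical helper (not from the authors' code). For each point and
    class, the Dirichlet concentration is alpha_eps, plus 1 for the observed
    class; the Gamma marginals are then moment-matched in log space.
    """
    n = y.shape[0]
    alpha = np.full((n, num_classes), alpha_eps)
    alpha[np.arange(n), y] += 1.0
    sigma2 = np.log(1.0 / alpha + 1.0)   # per-point heteroskedastic noise variance
    mu = np.log(alpha) - 0.5 * sigma2    # regression target (mean)
    return mu, sigma2
```

Each class column of `mu` is then regressed with a GP whose Gaussian likelihood uses the matching column of `sigma2` as per-point noise, which is what lets GPD run at the cost of regression rather than full GP classification.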
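The paper's implementation extends GPflow, whose exact 2018-era API we do not reproduce here; purely as an illustration, the sketch below uses scikit-learn's GaussianProcessRegressor as a stand-in. Its default behaviour of maximizing the log marginal likelihood during `fit` matches the hyper-parameter selection described in the setup row, and the `alpha` argument carries the per-point noise variances from the transformation above. One simplification: the paper shares kernel parameters across classes, whereas this sketch fits each class independently.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def fit_per_class_gps(X, mu, sigma2):
    """Fit one GP regressor per class on the transformed targets.

    Illustrative stand-in for the paper's GPflow-based implementation.
    """
    models = []
    for k in range(mu.shape[1]):
        kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)  # isotropic RBF
        gp = GaussianProcessRegressor(kernel=kernel, alpha=sigma2[:, k])
        gp.fit(X, mu[:, k])  # fit() maximizes the log marginal likelihood
        models.append(gp)
    return models
```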
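Finally, the Platt-scaling calibration mentioned in the "Dataset Splits" row fits a sigmoid p = σ(a·f + b) to held-out scores by minimizing the negative log-likelihood. A minimal binary-case sketch (the function name and optimizer choice are our assumptions):

```python
import numpy as np
from scipy.optimize import minimize

def platt_scaling(scores_cal, labels_cal):
    """Fit Platt scaling parameters (a, b) on the 20% calibration split.

    scores_cal: real-valued model outputs on held-out data.
    labels_cal: binary labels in {0, 1}.
    """
    def nll(params):
        a, b = params
        z = a * scores_cal + b
        log_p = -np.logaddexp(0.0, -z)    # log sigmoid(z), numerically stable
        log_1mp = -np.logaddexp(0.0, z)   # log(1 - sigmoid(z))
        return -np.sum(labels_cal * log_p + (1 - labels_cal) * log_1mp)

    return minimize(nll, x0=np.array([1.0, 0.0]), method="Nelder-Mead").x

# Usage: a, b = platt_scaling(scores_cal, labels_cal)
# Calibrated test probabilities: 1.0 / (1.0 + np.exp(-(a * scores_test + b)))
```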