Dirichlet-based Gaussian Processes for Large-scale Calibrated Classification

Authors: Dimitrios Milios, Raffaello Camoriano, Pietro Michiardi, Lorenzo Rosasco, Maurizio Filippone

NeurIPS 2018

Reproducibility Variable | Result | LLM Response (evidence from the paper)
Research Type | Experimental | "Extensive experimental results show that the proposed approach provides essentially the same accuracy and uncertainty quantification as Gaussian process classification while requiring only a fraction of computational resources." "We experimentally evaluate the methodologies discussed on the datasets outlined in Table 1." (Section 5, Experiments)
Researcher Affiliation | Academia | Dimitrios Milios (EURECOM, Sophia Antipolis, France; dimitrios.milios@eurecom.fr); Raffaello Camoriano (LCSL, IIT, Italy & MIT, USA; raffaello.camoriano@iit.it); Pietro Michiardi (EURECOM, Sophia Antipolis, France; pietro.michiardi@eurecom.fr); Lorenzo Rosasco (DIBRIS, Università degli Studi di Genova, Italy, and LCSL, IIT, Italy & MIT, USA; lrosasco@mit.edu); Maurizio Filippone (EURECOM, Sophia Antipolis, France; maurizio.filippone@eurecom.fr)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code is available at https://github.com/dmilios/dirichletGPC."
Open Datasets | Yes | "Table 1: Datasets used for evaluation, available from the UCI repository [1]." Columns: Dataset, Classes, Training instances, Test instances, Dimensionality, Inducing points (individual rows omitted here). Reference: [1] A. Asuncion and D. J. Newman. UCI Machine Learning Repository, 2007.
Dataset Splits | Yes | "For GPR we further split each training dataset: 80% is used to train the model and the remaining 20% is used for calibration with Platt scaling. NKRR uses an 80-20% split for k-fold cross-validation and Platt scaling calibration, respectively." (A minimal sketch of this split protocol is given after the table.)
Hardware Specification | Yes | "R. C. and L. R. gratefully acknowledge the support of NVIDIA Corporation for the donation of the Titan Xp GPUs and the Tesla K40 GPU used for this research."
Software Dependencies | No | "For the implementation of GP-based models, we use and extend the algorithms available in the GPflow library [18]." No specific version numbers are given for GPflow or TensorFlow; only the library names are mentioned.
Experiment Setup | Yes | "In all experiments, we consider an isotropic RBF kernel; the kernel hyper-parameters are selected by maximizing the marginal likelihood for the GP-based approaches, and by k-fold cross-validation for NKRR (with k = 10 for all datasets except SUSY, for which k = 5). In the case of GPD, kernel parameters are shared across classes, so they are informed by all classes. In the case of GPR, we also optimize the noise variance jointly with all kernel parameters. For each of the datasets, the α_ε parameter of GPD was selected according to the training MNLL: we have 0.1 for COVERBIN, 0.001 for LETTER, DRIVE and MOCAP, and 0.01 for the remaining datasets." (A sketch of this GPD setup is given after the table.)
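The split protocol quoted in the Dataset Splits row can be summarized with a short, hedged sketch. Only the 80/20 proportion between model fitting and Platt-scaling calibration comes from the paper; the library choice (scikit-learn), the stratification, and the random seed are illustrative assumptions.

    # Minimal sketch of the GPR split protocol: 80% of each training set fits the
    # model, 20% is held out for Platt-scaling calibration. scikit-learn, the
    # stratification, and the seed are assumptions, not taken from the paper.
    import numpy as np
    from sklearn.model_selection import train_test_split

    def gpr_calibration_split(X, y, seed=0):
        X_fit, X_cal, y_fit, y_cal = train_test_split(
            X, y, test_size=0.2, random_state=seed, stratify=y)
        return (X_fit, y_fit), (X_cal, y_cal)

    # Toy usage with random data standing in for a UCI dataset.
    X = np.random.randn(1000, 10)
    y = np.random.randint(0, 2, size=1000)
    (X_fit, y_fit), (X_cal, y_cal) = gpr_calibration_split(X, y)
    print(X_fit.shape, X_cal.shape)  # (800, 10) (200, 10)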
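For the Experiment Setup row, the following is a minimal sketch of the GPD construction under stated assumptions: the Dirichlet-to-lognormal label transformation (α = one-hot label + α_ε, σ̃² = log(1/α + 1), ỹ = log α - σ̃²/2) is taken from the paper itself, while scikit-learn's GaussianProcessRegressor stands in for the authors' GPflow implementation, and the per-class GPs below do not share kernel hyper-parameters the way the paper's GPD does.

    # Hedged sketch of GPD: one-hot labels shifted by alpha_eps, mapped to
    # lognormal regression targets, then a GP regressor with an isotropic RBF
    # kernel is fitted per class by maximizing the marginal likelihood.
    # scikit-learn replaces the authors' GPflow code; unlike the paper, kernel
    # hyper-parameters are NOT shared across classes in this sketch.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    def dirichlet_targets(y, n_classes, alpha_eps=0.01):
        """Map integer labels to per-class regression targets and noise variances."""
        alpha = np.full((len(y), n_classes), alpha_eps)
        alpha[np.arange(len(y)), y] += 1.0
        sigma2 = np.log(1.0 / alpha + 1.0)        # heteroskedastic noise variances
        y_tilde = np.log(alpha) - 0.5 * sigma2    # transformed regression targets
        return y_tilde, sigma2

    def fit_gpd(X, y, n_classes, alpha_eps=0.01):
        y_tilde, sigma2 = dirichlet_targets(y, n_classes, alpha_eps)
        models = []
        for c in range(n_classes):
            kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)   # isotropic RBF
            gpr = GaussianProcessRegressor(kernel=kernel,
                                           alpha=sigma2[:, c],     # per-point noise
                                           n_restarts_optimizer=3)
            gpr.fit(X, y_tilde[:, c])  # hyper-parameters maximize the marginal likelihood
            models.append(gpr)
        return models

    # Toy usage; alpha_eps would be chosen per dataset by training MNLL (0.1, 0.01 or 0.001).
    X = np.random.randn(200, 4)
    y = np.random.randint(0, 3, size=200)
    models = fit_gpd(X, y, n_classes=3, alpha_eps=0.01)

In the paper, kernel parameters are shared across all classes and the implementation relies on sparse/variational GPs in GPflow for scalability; the independent exact GPs above are only meant to illustrate the label transformation and the marginal-likelihood-based hyper-parameter selection.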