Large-scale probabilistic predictors with and without guarantees of validity
Authors: Vladimir Vovk, Ivan Petej, Valentina Fedorova
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper studies theoretically and empirically a method of turning machine-learning algorithms into probabilistic predictors that automatically enjoys a property of validity (perfect calibration) and is computationally efficient. When these imprecise probabilities are merged into precise probabilities, the resulting predictors, while losing the theoretical property of perfect calibration, are consistently more accurate than the existing methods in empirical studies. |
| Researcher Affiliation | Collaboration | Department of Computer Science, Royal Holloway, University of London, UK; Yandex, Moscow, Russia. {volodya.vovk,ivan.petej,alushaf}@gmail.com |
| Pseudocode | Yes | Algorithm 1 CVAP(T, x) // cross-Venn–Abers predictor for training set T. 1: split the training set T into K folds T1, ..., TK; 2: for k ∈ {1, ..., K}: 3: (p_0^k, p_1^k) := IVAP(T \ Tk, Tk, x); 4: return GM(p1) / (GM(1 − p0) + GM(p1)). (See the Python sketch of IVAP/CVAP after the table.) |
| Open Source Code | No | The paper states: 'our code being publicly available [9]'. However, reference [9] is to an arXiv technical report ('arXiv.org e-Print archive, November 2015. A full version of this paper.'), not a direct link to a code repository such as GitHub, GitLab, or Bitbucket, nor does it explicitly state that the code is provided as supplementary material with the paper. |
| Open Datasets | Yes | For illustrating our results in this paper we use the adult data set available from the UCI repository [18] (this is the main data set used in [6] and one of the data sets used in [8]). |
| Dataset Splits | Yes | We use the original split of the data set into a training set of Ntrain = 32,561 observations and a test set of Ntest = 16,281 observations. In the case of CVAPs, the training set is split into K equal (or as close to equal as possible) contiguous folds: the first ⌈Ntrain/K⌉ training observations are included in the first fold, the next ⌈Ntrain/K⌉ (or ⌊Ntrain/K⌋) in the second fold, etc. (⌈·⌉ first and then ⌊·⌋ is used unless Ntrain is divisible by K). In the case of the other calibration methods, we used the first (K−1)/K · Ntrain training observations as the proper training set (used for training the scoring algorithm) and the rest of the training observations as the calibration set. (See the split sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It describes the software and datasets but not the underlying computational resources. |
| Software Dependencies | No | The paper mentions the software used, such as 'Weka [17]', 'MATLAB's Statistics toolbox', and the 'R package fdrtool (namely, the function monoreg)'. However, it does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | For each of the standard prediction algorithms within Weka that we use, we optimise the parameters by minimising the Brier loss on the calibration set, apart from the column labelled 'all'. Most of the parameters are set to their default values, and the only parameters that are optimised are C (pruning confidence) for J48 and J48 bagging, R (ridge) for logistic regression, L (learning rate) and M (momentum) for neural networks (Multilayer Perceptron), and C (complexity constant) for SVM (SMO, with the linear kernel); naïve Bayes does not involve any parameters. (See the tuning sketch after the table.) |
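
The Algorithm 1 pseudocode above translates almost line by line into code. Below is a minimal Python sketch of IVAP and CVAP, assuming a scikit-learn-style binary classifier with a `decision_function` scoring method; the names `ivap`, `cvap`, and `gm` are ours, not from the authors' code. The IVAP step follows the paper's construction: fit isotonic regression to the calibration scores with the test object appended under the hypothetical labels 0 and 1, giving the pair (p0, p1).

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def gm(v):
    """Geometric mean, with a small clip to guard against log(0)."""
    v = np.clip(np.asarray(v, dtype=float), 1e-12, None)
    return np.exp(np.mean(np.log(v)))

def ivap(clf, X_proper, y_proper, X_calib, y_calib, x):
    """Inductive Venn-Abers predictor: returns the pair (p0, p1) for x."""
    clf.fit(X_proper, y_proper)
    s_calib = clf.decision_function(X_calib)          # calibration scores
    s_x = clf.decision_function(x.reshape(1, -1))[0]  # test-object score
    p = []
    for label in (0, 1):
        # Isotonic regression on the calibration scores plus the test
        # object hypothetically labelled `label`; p_label is the fitted
        # value of the isotonic function at s_x.
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        iso.fit(np.append(s_calib, s_x), np.append(y_calib, label))
        p.append(iso.predict([s_x])[0])
    return p[0], p[1]

def cvap(clf, X, y, x, K=5):
    """Cross Venn-Abers predictor (Algorithm 1): merge K IVAP outputs."""
    idx = np.arange(len(y))
    p0s, p1s = [], []
    for fold in np.array_split(idx, K):               # contiguous folds
        rest = np.setdiff1d(idx, fold)                # T \ T_k
        p0, p1 = ivap(clf, X[rest], y[rest], X[fold], y[fold], x)
        p0s.append(p0)
        p1s.append(p1)
    # Line 4 of Algorithm 1: merge via geometric means.
    return gm(p1s) / (gm(1.0 - np.array(p0s)) + gm(p1s))
```

For instance, `cvap(LogisticRegression(), X_train, y_train, x_test)` would return a single merged probability that `x_test` has label 1.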
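The splits in the 'Dataset Splits' row are likewise straightforward to reproduce. A minimal sketch, assuming 0-indexed NumPy arrays and an illustrative K; `np.array_split` happens to match the described scheme, putting the larger ⌈Ntrain/K⌉-sized chunks first:

```python
import numpy as np

Ntrain, K = 32561, 5        # training-set size from the row; K is illustrative
idx = np.arange(Ntrain)

# CVAP: K contiguous, (nearly) equal folds; sizes differ by at most 1,
# with the larger folds coming first.
folds = np.array_split(idx, K)

# Other calibration methods: the first (K-1)/K of the training set is the
# proper training set, the remainder the calibration set.
n_proper = (K - 1) * Ntrain // K
proper, calib = idx[:n_proper], idx[n_proper:]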
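Finally, the tuning loop in the 'Experiment Setup' row, which selects each parameter by minimising the Brier loss on the calibration set, can be sketched as follows. This uses scikit-learn's LogisticRegression in place of Weka's implementation, and the ridge grid is an illustrative assumption (note that sklearn's C is the inverse of the ridge penalty):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def brier_loss(y_true, p):
    """Mean squared difference between predicted probability and label."""
    return np.mean((np.asarray(p) - np.asarray(y_true)) ** 2)

def tune_ridge(X_proper, y_proper, X_calib, y_calib,
               grid=(1e-4, 1e-3, 1e-2, 1e-1, 1.0)):
    """Pick the ridge value minimising Brier loss on the calibration set."""
    best_loss, best_ridge = np.inf, None
    for ridge in grid:
        clf = LogisticRegression(C=1.0 / ridge).fit(X_proper, y_proper)
        p = clf.predict_proba(X_calib)[:, 1]   # predicted P(label = 1)
        loss = brier_loss(y_calib, p)
        if loss < best_loss:
            best_loss, best_ridge = loss, ridge
    return best_ridge
```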