Calibrated Structured Prediction
Authors: Volodymyr Kuleshov, Percy S. Liang
NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our proposed recalibrators and features on three real-world tasks. Multiclass image classification. ... Optical character recognition. ... Scene understanding. ... We report results using calibration curves... |
| Researcher Affiliation | Academia | Volodymyr Kuleshov Department of Computer Science Stanford University Stanford, CA 94305 Percy Liang Department of Computer Science Stanford University Stanford, CA 94305 |
| Pseudocode | Yes | Algorithm 1 Recalibration procedure for calibrated structured prediction. Input: Features φ(x, E) from trained model pθ, event set I(x), recalibration set S = {(xᵢ, yᵢ)}ᵢ₌₁ⁿ. Output: Forecaster F(x, E). Construct the events dataset: Sbinary = {(φ(x, E), I[y ∈ E]) : (x, y) ∈ S, E ∈ I(x)}. Train the forecaster F (e.g., k-NN or decision trees) on Sbinary. (A runnable sketch of this procedure appears below the table.) |
| Open Source Code | Yes | All code, data, and experiments for this paper are available on CodaLab at https://www.codalab.org/worksheets/0xecc9a01cfcbc4cd6b0444a92d259a87c/. |
| Open Datasets | Yes | We perform our experiments on the CIFAR-10 dataset [15], which consists of 60,000 32x32 color images of different types of animals and vehicles (ten classes in total). |
| Dataset Splits | Yes | 38,000 images were used for training, 2,000 for calibration, and 20,000 for testing. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments (e.g., CPU/GPU models, memory). |
| Software Dependencies | No | The paper mentions software components and algorithms like 'linear SVM', 'CRFs', 'k-NN', 'decision trees', and 'AD3', but it does not provide specific version numbers for these software dependencies or libraries. |
| Experiment Setup | Yes | We use decision trees and k-NN as our recalibration algorithms... We further discretize probabilities into buckets of size 0.1... For each N and each algorithm we choose a hyperparameter (minimum leaf size for decision trees, k in k-NN) by 10-fold cross-validation on S. We tried values between 5 and 500 in increments of 5. (Sketches of the cross-validation search and the bucketed calibration curve appear below the table.) |
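
The Pseudocode row above quotes Algorithm 1 in full. As a reading aid, here is a minimal Python sketch of that procedure. It is not the authors' released code: the feature map `phi` and the event-set function `events` are hypothetical stand-ins for the paper's φ(x, E) and I(x).

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def build_events_dataset(recalibration_set, phi, events):
    """Construct Sbinary = {(phi(x, E), I[y in E]) : (x, y) in S, E in I(x)}."""
    features, indicators = [], []
    for x, y in recalibration_set:
        for E in events(x):               # each E is a set of outputs from I(x)
            features.append(phi(x, E))    # forecaster features for this event
            indicators.append(1 if y in E else 0)  # did the event occur?
    return np.asarray(features), np.asarray(indicators)


def train_forecaster(recalibration_set, phi, events, k=50):
    """Train the forecaster F (here k-NN, one of the paper's two choices)."""
    X, z = build_events_dataset(recalibration_set, phi, events)
    F = KNeighborsClassifier(n_neighbors=k).fit(X, z)
    # Calibrated probability of an event: F.predict_proba([phi(x, E)])[0, 1]
    return F
```

With k-NN, the probability estimate for an event is simply the fraction of positive neighbors in the events dataset, which is what makes this a recalibration step rather than a new structured model.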
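
The Experiment Setup row describes the hyperparameter search. Under the stated protocol (10-fold cross-validation over candidate values 5, 10, ..., 500), the search could look like the following sketch; the synthetic `X_binary`/`z_binary` arrays are placeholders for the real events dataset, not the paper's data.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder events dataset; in the paper this is Sbinary built from the
# recalibration set, not random data.
rng = np.random.default_rng(0)
X_binary = rng.normal(size=(2000, 3))
z_binary = (X_binary[:, 0] + rng.normal(size=2000) > 0).astype(int)

grid = list(range(5, 501, 5))  # values between 5 and 500 in increments of 5

knn = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": grid}, cv=10)
tree = GridSearchCV(DecisionTreeClassifier(), {"min_samples_leaf": grid}, cv=10)
knn.fit(X_binary, z_binary)
tree.fit(X_binary, z_binary)

print(knn.best_params_, tree.best_params_)  # chosen k and minimum leaf size
```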
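
The paper reports results using calibration curves with probabilities discretized into buckets of size 0.1. Below is a minimal, self-contained sketch of such a curve; the toy `probs`/`z` arrays stand in for the forecaster's held-out predictions and the binary event indicators.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve

# Toy stand-ins: `probs` would be F's forecasted event probabilities on
# held-out data, `z` the binary indicators I[y in E].
rng = np.random.default_rng(0)
probs = rng.uniform(size=2000)
z = (rng.uniform(size=2000) < probs).astype(int)  # perfectly calibrated toy data

frac_pos, mean_pred = calibration_curve(z, probs, n_bins=10)  # buckets of 0.1

plt.plot(mean_pred, frac_pos, marker="o", label="forecaster")
plt.plot([0, 1], [0, 1], linestyle="--", label="perfect calibration")
plt.xlabel("Forecasted probability")
plt.ylabel("Empirical frequency of the event")
plt.legend()
plt.show()
```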