Conformalized Quantile Regression
Authors: Yaniv Romano, Evan Patterson, Emmanuel J. Candès
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | supplemented by extensive experiments on popular regression datasets. We compare the efficiency of conformalized quantile regression to other conformal methods, showing that our method tends to produce shorter intervals. We conduct the experiments on eleven benchmark datasets for regression, listed in the supplementary material. |
| Researcher Affiliation | Academia | Yaniv Romano Department of Statistics Stanford University Evan Patterson Department of Statistics Stanford University Emmanuel J. Candès Departments of Mathematics and of Statistics Stanford University |
| Pseudocode | Yes | Algorithm 1: Split Conformal Quantile Regression. Input: data (Xᵢ, Yᵢ), 1 ≤ i ≤ n; miscoverage level α ∈ (0, 1); quantile regression algorithm A. Process: Randomly split {1, …, n} into two disjoint sets I₁ and I₂. Fit two conditional quantile functions: {q̂_{α_lo}, q̂_{α_hi}} ← A({(Xᵢ, Yᵢ) : i ∈ I₁}). Compute Eᵢ for each i ∈ I₂, as in equation (6). Compute Q_{1−α}(E, I₂), the (1 − α)(1 + 1/\|I₂\|)-th empirical quantile of {Eᵢ : i ∈ I₂}. Output: prediction interval C(x) = [q̂_{α_lo}(x) − Q_{1−α}(E, I₂), q̂_{α_hi}(x) + Q_{1−α}(E, I₂)] for X_{n+1} = x. |
| Open Source Code | Yes | An implementation of CQR is available online at https://github.com/yromano/cqr. |
| Open Datasets | Yes | We conduct the experiments on eleven benchmark datasets for regression, listed in the supplementary material. Physicochemical properties of protein tertiary structure data set. https://archive.ics.uci.edu/ml/datasets/Physicochemical+Properties+of+Protein+Tertiary+Structure. Accessed: January, 2019. |
| Dataset Splits | Yes | 80% of the examples are used for training and the remaining 20% for testing. The proper training and calibration sets for split conformal prediction have equal size. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using "quantile neural networks [20] and quantile regression forests [22]" but does not specify any software names with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | Consider, for instance, the tuning of typical hyper-parameters of neural networks, such as the batch size, the learning rate, and the number of epochs. The hyperparameters may be selected, as usual, by cross validation, where we minimize the average interval length over the folds. We can mitigate this problem by tuning the nominal quantiles of the underlying method as additional hyper-parameters in cross validation. To reduce the computational cost, instead of fitting two separate neural networks... we can replace the standard one-dimensional estimate of the unknown response by a two-dimensional estimate... We adopt this approach in the experiments of Section 6. |
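Algorithm 1 quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released implementation (which lives at the GitHub link in the table): it assumes scikit-learn's `GradientBoostingRegressor` with `loss="quantile"` as a stand-in for the generic quantile regression algorithm A, and uses the conformity score E = max(q̂_lo(X) − Y, Y − q̂_hi(X)) from the paper's equation (6).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor


def cqr_interval(X_train, y_train, X_test, alpha=0.1, seed=0):
    """Split Conformal Quantile Regression (Algorithm 1), sketched with
    gradient-boosted quantile regressors standing in for algorithm A."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    idx = rng.permutation(n)
    # Equal-size proper-training (I1) and calibration (I2) sets,
    # as in the paper's split.
    i1, i2 = idx[: n // 2], idx[n // 2:]

    # Fit lower and upper conditional quantile functions on I1.
    q_lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2,
                                     random_state=seed)
    q_hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2,
                                     random_state=seed)
    q_lo.fit(X_train[i1], y_train[i1])
    q_hi.fit(X_train[i1], y_train[i1])

    # Conformity scores on I2 (equation (6)): how far each calibration
    # point falls outside the fitted quantile band.
    lo, hi = q_lo.predict(X_train[i2]), q_hi.predict(X_train[i2])
    E = np.maximum(lo - y_train[i2], y_train[i2] - hi)

    # (1 - alpha)(1 + 1/|I2|)-th empirical quantile of the scores.
    level = (1 - alpha) * (1 + 1 / len(i2))
    Q = np.quantile(E, min(level, 1.0), method="higher")

    # Widen (or shrink) the band by Q to get the conformal interval.
    return q_lo.predict(X_test) - Q, q_hi.predict(X_test) + Q
```

On heteroscedastic synthetic data (e.g. noise scale growing with x), the empirical test-set coverage of the returned intervals should land near the nominal 1 − α = 0.9, which is the marginal guarantee the paper proves.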