Locally Valid and Discriminative Prediction Intervals for Deep Learning Models
Authors: Zhen Lin, Shubhendu Trivedi, Jimeng Sun
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify, using diverse datasets, that besides being the only locally valid method for DL, LVD also exceeds or matches the performance (including coverage rate and prediction accuracy) of existing uncertainty quantification methods, while offering additional benefits in scalability and flexibility. |
| Researcher Affiliation | Academia | Zhen Lin, University of Illinois at Urbana-Champaign, Urbana, IL 61801, zhenlin4@illinois.edu; Shubhendu Trivedi, MIT, Cambridge, MA 02139, shubhendu@csail.mit.edu; Jimeng Sun, University of Illinois at Urbana-Champaign, Urbana, IL 61801, jimeng@illinois.edu |
| Pseudocode | Yes | Algorithm 1 LVD. Input: S_train, a set of observations {Z_i = (X_i, Y_i)}, i = 1, ..., N; α, a parameter specifying the (local) target coverage rate; X_{N+1}, an unseen data point. Output: a locally valid PI, Ĉ_α(X_{N+1}). (A hedged interface sketch is given below the table.) |
| Open Source Code | Yes | The code to replicate all our results can be found at https://github.com/zlin7/LVD. |
| Open Datasets | Yes | We will be using a series of standard benchmark datasets in the uncertainty literature [1, 29, 15], including: UCI Yacht Hydrodynamics (Yacht) [38], UCI Bikesharing (Bike) [35], UCI Energy Efficiency (Energy) [37], UCI Concrete Compressive Strength (Concrete) [36], Boston Housing (Housing) [9], Kin8nm [16]. We also use QM8 (16 sub-tasks) and QM9 (12 sub-tasks) [28, 30, 27] as examples of more complicated datasets. |
| Dataset Splits | Yes | At the onset, we partition S_train of N data points into two sets, S_embed and S_conformal. We will denote S_embed as {Z_i}, i = 1, ..., n and S_conformal as {Z_{n+i}}, i = 1, ..., m, where m = N - n. S_embed is used to learn an embedding function f and a kernel K, and S_conformal is used for conformal prediction. ... In each experiment, 20% of the data is used for testing. ... Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In the Appendix due to space constraint. (See the split sketch below the table.) |
| Hardware Specification | Yes | For the largest dataset, QM9, the extra inference time of LVD vs. inference time of the original NN is 0.65 vs. 0.75 seconds per 1000 samples on an NVIDIA 2080Ti GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' and 'Adam' as software used (references [26] and [17], respectively) but does not provide version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper states: 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In the Appendix due to space constraint.' This indicates that detailed experimental setup information, such as specific hyperparameter values, is not present in the main text. |
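
The Pseudocode row above quotes only the input/output signature of Algorithm 1. As a point of reference, the following is a minimal Python sketch of plain split conformal regression with the same shape of interface (training observations, target miscoverage α, unseen point → prediction interval). It is not the paper's LVD procedure: LVD additionally learns an embedding f and a kernel K on S_embed to obtain local, not just marginal, validity, which this sketch omits. The Ridge regressor and the 50/50 fit/calibration split below are illustrative assumptions, not details from the paper.

```python
# Hedged sketch only: plain split conformal regression, NOT the paper's LVD
# method. It illustrates the Algorithm 1 interface
# (S_train, alpha, X_{N+1}) -> prediction interval C_alpha(X_{N+1}).
import math
import numpy as np
from sklearn.linear_model import Ridge  # illustrative stand-in for a deep model


def split_conformal_interval(X_train, y_train, alpha, x_new, seed=0):
    """Return a (1 - alpha) marginal-coverage prediction interval for x_new."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_train))
    half = len(idx) // 2
    fit_idx, cal_idx = idx[:half], idx[half:]  # fit half / calibration half (assumed 50/50)

    model = Ridge().fit(X_train[fit_idx], y_train[fit_idx])

    # Nonconformity scores: absolute residuals on the calibration half.
    residuals = np.abs(y_train[cal_idx] - model.predict(X_train[cal_idx]))
    m = len(residuals)
    # Conformal quantile with the finite-sample (m + 1) correction.
    k = min(m, math.ceil((m + 1) * (1 - alpha)))
    q = np.sort(residuals)[k - 1]

    y_hat = model.predict(x_new.reshape(1, -1))[0]
    return y_hat - q, y_hat + q
```

For NumPy arrays `X`, `y`, and a test point `X_test[0]`, `lower, upper = split_conformal_interval(X, y, alpha=0.1, x_new=X_test[0])` would give an interval with at least 90% marginal coverage under exchangeability; LVD's contribution is making such coverage hold locally as well.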
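The Dataset Splits row quotes the S_embed / S_conformal partition and the 20% test hold-out. The sketch below shows one way such splits could be produced; the 50/50 ratio between S_embed and S_conformal and the helper name `make_splits` are assumptions for illustration, not details taken from the paper's main text.

```python
# Hedged sketch of the data handling quoted above: hold out 20% for testing,
# then partition the remaining S_train of N points into S_embed (n points, for
# learning the embedding f and kernel K) and S_conformal (m = N - n points, for
# conformal calibration). The embed_frac=0.5 default is an assumption.
import numpy as np


def make_splits(X, y, test_frac=0.2, embed_frac=0.5, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))

    n_test = int(round(test_frac * len(X)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    N = len(train_idx)
    n = int(round(embed_frac * N))        # |S_embed|
    embed_idx = train_idx[:n]             # used to learn f and K
    conformal_idx = train_idx[n:]         # the remaining m = N - n calibration points

    return {
        "embed": (X[embed_idx], y[embed_idx]),
        "conformal": (X[conformal_idx], y[conformal_idx]),
        "test": (X[test_idx], y[test_idx]),
    }
```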