Locally Valid and Discriminative Prediction Intervals for Deep Learning Models

Authors: Zhen Lin, Shubhendu Trivedi, Jimeng Sun

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We empirically verify, using diverse datasets, that besides being the only locally valid method for DL, LVD also exceeds or matches the performance (including coverage rate and prediction accuracy) of existing uncertainty quantification methods, while offering additional benefits in scalability and flexibility. |
| Researcher Affiliation | Academia | Zhen Lin, University of Illinois at Urbana-Champaign, Urbana, IL 61801, zhenlin4@illinois.edu; Shubhendu Trivedi, MIT, Cambridge, MA 02139, shubhendu@csail.mit.edu; Jimeng Sun, University of Illinois at Urbana-Champaign, Urbana, IL 61801, jimeng@illinois.edu |
| Pseudocode | Yes | Algorithm 1 LVD. Input: $S_{\text{train}}$: a set of observations $\{Z_i = (X_i, Y_i)\}_{i=1}^{N}$; $\alpha$: parameter specifying (local) target coverage rate; $X_{N+1}$: unseen data point. Output: a locally valid PI, $\hat{C}_\alpha(X_{N+1})$. |
| Open Source Code | Yes | The code to replicate all our results can be found at https://github.com/zlin7/LVD. |
| Open Datasets | Yes | We will be using a series of standard benchmark datasets in the uncertainty literature [1, 29, 15], including: UCI Yacht Hydrodynamics (Yacht) [38], UCI Bikesharing (Bike) [35], UCI Energy Efficiency (Energy) [37], UCI Concrete Compressive Strength (Concrete) [36], Boston Housing (Housing) [9], Kin8nm [16]. We also use QM8 (16 sub-tasks) and QM9 (12 sub-tasks) [28, 30, 27] as examples of more complicated datasets. |
| Dataset Splits | Yes | At the onset, we partition $S_{\text{train}}$ of $N$ data points into two sets $S_{\text{embed}}$ and $S_{\text{conformal}}$. We will denote $S_{\text{embed}}$ as $\{Z_i\}_{i=1}^{n}$ and $S_{\text{conformal}}$ as $\{Z_{n+i}\}_{i=1}^{m}$, where $m = N - n$. $S_{\text{embed}}$ is used to learn an embedding function $f$ and a kernel $K$, and $S_{\text{conformal}}$ is used for conformal prediction. ... In each experiment, 20% of the data is used for testing. ... Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In the Appendix due to space constraint. |
| Hardware Specification | Yes | For the largest dataset, QM9, the extra inference time of LVD vs. inference time of the original NN is 0.65 vs. 0.75 second per 1000 samples on an NVIDIA 2080Ti GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' and 'Adam' as software used (references [26] and [17] respectively) but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper states: 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In the Appendix due to space constraint.' This indicates that detailed experimental setup information, such as specific hyperparameter values, is not present in the main text. |
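
The Dataset Splits entry can be made concrete with a small sketch. The code below is not taken from the LVD repository. It holds out 20% of the data for testing, as the entry states, and divides the remainder into S_embed (size n) and S_conformal (size m = N - n). The function names, the 50/50 ratio between S_embed and S_conformal, and the random seed are hypothetical choices, since the summary does not report them. The second function is ordinary (marginal) split conformal prediction on absolute residuals, shown only to illustrate what a held-out conformal set is used for. It is not LVD's locally valid, kernel-based procedure, and it assumes α denotes the miscoverage level, so the target coverage is 1 - α.

```python
import math
import numpy as np


def partition(n_total, test_frac=0.2, embed_frac=0.5, seed=0):
    """Return index arrays (embed, conformal, test).

    test_frac=0.2 follows the summary; embed_frac and seed are
    hypothetical values chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)
    n_test = int(round(test_frac * n_total))
    test_idx, train_idx = perm[:n_test], perm[n_test:]  # S_train has N points
    n_embed = int(round(embed_frac * len(train_idx)))   # n
    embed_idx = train_idx[:n_embed]                     # S_embed
    conformal_idx = train_idx[n_embed:]                 # S_conformal, m = N - n
    return embed_idx, conformal_idx, test_idx


def split_conformal_halfwidth(residuals, alpha):
    """Half-width q of the marginal interval [y_hat - q, y_hat + q].

    Plain split conformal on absolute residuals from S_conformal,
    NOT the kernel-weighted local procedure of LVD.
    """
    abs_res = np.sort(np.abs(np.asarray(residuals, dtype=float)))
    m = len(abs_res)
    k = math.ceil((m + 1) * (1 - alpha))  # rank of the conformal quantile
    if k > m:
        return math.inf  # too few calibration points for this alpha
    return float(abs_res[k - 1])


embed_idx, conformal_idx, test_idx = partition(1000)
```

With 1000 points and the defaults above, the three index sets are disjoint and together cover every point. A model would be fit on `embed_idx`, its residuals computed on `conformal_idx`, and the resulting half-width applied to predictions on `test_idx`.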