reproducibilityindex.ai

Locally Valid and Discriminative Prediction Intervals for Deep Learning Models

Authors: Zhen Lin, Shubhendu Trivedi, Jimeng Sun

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically verify, using diverse datasets, that besides being the only locally valid method for DL, LVD also exceeds or matches the performance (including coverage rate and prediction accuracy) of existing uncertainty quantification methods, while offering additional benefits in scalability and flexibility.
Researcher Affiliation	Academia	Zhen Lin, University of Illinois at Urbana-Champaign, Urbana, IL 61801, zhenlin4@illinois.edu; Shubhendu Trivedi, MIT, Cambridge, MA 02139, shubhendu@csail.mit.edu; Jimeng Sun, University of Illinois at Urbana-Champaign, Urbana, IL 61801, jimeng@illinois.edu
Pseudocode	Yes	Algorithm 1 LVD. Input: $S_{\text{train}}$: a set of observations $\{Z_i = (X_i, Y_i)\}_{i=1}^{N}$; $\alpha$: parameter specifying (local) target coverage rate; $X_{N+1}$: unseen data point. Output: a locally valid PI, $\hat{C}_\alpha(X_{N+1})$.
Open Source Code	Yes	The code to replicate all our results can be found at https://github.com/zlin7/LVD.
Open Datasets	Yes	We will be using a series of standard benchmark datasets in the uncertainty literature [1, 29, 15], including: UCI Yacht Hydrodynamics (Yacht) [38], UCI Bikesharing (Bike) [35], UCI Energy Efficiency (Energy) [37], UCI Concrete Compressive Strength (Concrete) [36], Boston Housing (Housing) [9], Kin8nm [16]. We also use QM8 (16 sub-tasks) and QM9 (12 sub-tasks) [28, 30, 27] as examples of more complicated datasets.
Dataset Splits	Yes	At the onset, we partition $S_{\text{train}}$ of $N$ data points into two sets $S_{\text{embed}}$ and $S_{\text{conformal}}$. We will denote $S_{\text{embed}}$ as $\{Z_i\}_{i=1}^{n}$ and $S_{\text{conformal}}$ as $\{Z_{n+i}\}_{i=1}^{m}$, where $m = N - n$. $S_{\text{embed}}$ is used to learn an embedding function $f$ and a kernel $K$, and $S_{\text{conformal}}$ is used for conformal prediction. ... In each experiment, 20% of the data is used for testing. ... Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In the Appendix due to space constraint.
Hardware Specification	Yes	For the largest dataset, QM9, the extra inference time of LVD vs. inference time of the original NN is 0.65 vs. 0.75 second per 1000 samples on an NVIDIA 2080Ti GPU.
Software Dependencies	No	The paper mentions 'PyTorch' and 'Adam' as software used (references [26] and [17] respectively) but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	No	The paper states: 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] In the Appendix due to space constraint.' This indicates that detailed experimental setup information, such as specific hyperparameter values, is not present in the main text.
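
The split described in the Dataset Splits row can be sketched roughly as follows. This is only an illustration, not the authors' code: the function name `split_indices`, the `embed_fraction` parameter, its default of 0.5, and the seed are assumptions made for the example. The quoted text states only that 20% of the data is held out for testing and that the training set of N points is partitioned into S_embed (n points) and S_conformal (m = N - n points); the actual ratio is deferred to the paper's appendix.

```python
import random


def split_indices(num_points, test_fraction=0.2, embed_fraction=0.5, seed=0):
    """Return index lists (embed, conformal, test) for a dataset of num_points rows.

    test_fraction=0.2 follows the quoted "20% of the data is used for testing".
    embed_fraction is a hypothetical knob: the quoted text does not say how
    S_train is divided between S_embed and S_conformal.
    """
    indices = list(range(num_points))
    random.Random(seed).shuffle(indices)  # reproducible shuffle

    # Hold out the test set first.
    num_test = int(round(test_fraction * num_points))
    test_idx = indices[:num_test]
    train_idx = indices[num_test:]  # S_train, with N = len(train_idx)

    # Partition S_train into S_embed (n points) and S_conformal (m = N - n points).
    n = int(round(embed_fraction * len(train_idx)))
    embed_idx = train_idx[:n]
    conformal_idx = train_idx[n:]
    return embed_idx, conformal_idx, test_idx


if __name__ == "__main__":
    embed_idx, conformal_idx, test_idx = split_indices(1000)
    print(len(embed_idx), len(conformal_idx), len(test_idx))
```

In this sketch the three index sets are disjoint by construction, so the points used to learn the embedding function and kernel are never reused for the conformal step or for testing.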