JAWS: Auditing Predictive Uncertainty Under Covariate Shift
Authors: Drew Prinster, Anqi Liu, Suchi Saria
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Practically, JAWS outperforms state-of-the-art predictive inference baselines on a variety of biased real-world datasets for both interval-generation and error-assessment predictive uncertainty auditing tasks. |
| Researcher Affiliation | Academia | Drew Prinster (drew@cs.jhu.edu), Anqi Liu (aliu@cs.jhu.edu), and Suchi Saria (ssaria@cs.jhu.edu) — Department of Computer Science, Johns Hopkins University, Baltimore, MD 21211. |
| Pseudocode | No | The paper describes algorithms but does not include a formal pseudocode block or algorithm environment. |
| Open Source Code | Yes | Additional analysis in Appendix D and code at https://github.com/drewprinster/jaws.git. |
| Open Datasets | Yes | We conduct experiments on five UCI datasets [Dua and Graff, 2017] of varying dimensionality (Table 2): airfoil self-noise, red wine quality prediction [Cortez et al., 2009], wave energy converters, superconductivity [Hamidieh, 2018], and communities and crime [Redmond and Baveja, 2002]. |
| Dataset Splits | No | We first randomly sample 200 points for the training data, and then sample the biased test data from the remaining datapoints (those not used for training) with probabilities proportional to exponential tilting weights (see the sampling sketch after the table). No explicit validation split or specific train/test percentages are provided beyond the 200 training points and the "remaining datapoints" used for testing. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU models, memory details) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | For neural network predictors, we use a 3-layer feed-forward neural network with ReLU activations and 512, 256, and 128 units per layer, respectively. For random forest predictors, we use an ensemble of 100 decision trees. The neural network learning rate was tuned over {1e-3, 1e-4, 1e-5} and the number of training epochs over {50, 100, 150, 200} (see the predictor-setup sketch after the table). |
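
The dataset-split row describes sampling a uniform training set and then a covariate-shifted test set via exponential tilting. The following is a minimal sketch of that procedure, not the paper's exact code: the tilting direction `lam`, the test-set size, and the choice to tilt along the first feature are illustrative assumptions.

```python
import numpy as np

def exponential_tilting_split(X, y, n_train=200, n_test=1000, lam=None, seed=0):
    """Sample a uniform training set, then a covariate-shifted test set whose
    inclusion probabilities are proportional to exp(x @ lam)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Uniformly sample the 200 training points described in the paper.
    train_idx = rng.choice(n, size=n_train, replace=False)
    remaining = np.setdiff1d(np.arange(n), train_idx)

    # Exponential tilting weights on the remaining points (tilting direction
    # is an illustrative assumption, not the paper's chosen parameter).
    if lam is None:
        lam = np.zeros(X.shape[1])
        lam[0] = 1.0
    w = np.exp(X[remaining] @ lam)
    p = w / w.sum()

    # Biased test sample: probabilities proportional to the tilting weights.
    test_idx = rng.choice(remaining, size=min(n_test, remaining.size),
                          replace=False, p=p)
    return (X[train_idx], y[train_idx]), (X[test_idx], y[test_idx])
```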
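
The experiment-setup row specifies the predictor architectures and tuning grids. The sketch below uses scikit-learn equivalents; the paper's released code may use a different framework, and the held-out selection criterion in `tune_mlp` is an assumption for illustration.

```python
from itertools import product
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import RandomForestRegressor

def make_random_forest(seed=0):
    # Ensemble of 100 decision trees, as stated in the setup.
    return RandomForestRegressor(n_estimators=100, random_state=seed)

def make_mlp(learning_rate, epochs, seed=0):
    # 3-layer feed-forward network with ReLU activations and 512/256/128 units.
    return MLPRegressor(hidden_layer_sizes=(512, 256, 128), activation="relu",
                        learning_rate_init=learning_rate, max_iter=epochs,
                        random_state=seed)

def tune_mlp(X_train, y_train, X_val, y_val):
    # Grid over the learning rates and epoch counts reported in the paper;
    # selection by held-out R^2 is an illustrative assumption.
    best_score, best_model = -float("inf"), None
    for lr, epochs in product([1e-3, 1e-4, 1e-5], [50, 100, 150, 200]):
        model = make_mlp(lr, epochs).fit(X_train, y_train)
        score = model.score(X_val, y_val)
        if score > best_score:
            best_score, best_model = score, model
    return best_model
```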