Predictive inference is free with the jackknife+-after-bootstrap

Authors: Byol Kim, Chen Xu, Rina Barber

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our numerical experiments verify the coverage and accuracy of the resulting predictive intervals on real data.
Researcher Affiliation | Academia | Department of Statistics, The University of Chicago, Chicago, IL 60637, USA (byolkim@uchicago.edu); H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA (cxu310@gatech.edu); Department of Statistics, The University of Chicago, Chicago, IL 60637, USA (rina@uchicago.edu)
Pseudocode | Yes | Algorithm 1 Ensemble learning. Input: data {(X_i, Y_i)}_{i=1}^n; output: ensembled regression function μ̂_φ. ... Algorithm 2 Jackknife+-after-bootstrap (J+aB). Input: data {(X_i, Y_i)}_{i=1}^n; output: predictive interval Ĉ^{J+aB}_{α,n,B}. ... Algorithm 3 Lifted J+aB residuals. Input: data {(X_i, Y_i)}_{i=1}^{n+1}; output: residuals (R_{ij} : i ≠ j ∈ {1, . . . , n+1}). (A hedged code sketch of the J+aB procedure follows the table.)
Open Source Code | Yes | The code is available online: https://www.stat.uchicago.edu/~rina/jackknife+-after-bootstrap_realdata.html
Open Datasets | Yes | We used three real data sets, which were also used in Barber et al. [2], following the same data preprocessing steps as described therein: the Communities and Crime (COMMUNITIES) data set [28] ...; the Blog Feedback (BLOG) data set [9] ...; and the Medical Expenditure Panel Survey (MEPS) 2016 data set from the Agency for Healthcare Research and Quality, with details for older versions in [13].
Dataset Splits | No | We used n = 40 observations for training, sampling uniformly without replacement to create a training-test split for each trial.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or cloud instance types) used for the experiments.
Software Dependencies | No | The paper mentions 'scikit-learn' but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | For RIDGE, we set the penalty at λ = 0.001‖X‖², where ‖X‖ is the spectral norm of the training data matrix. RF was implemented using the RandomForestRegressor method from scikit-learn, with 20 trees grown for each random forest using the mean absolute error criterion and the bootstrap option turned off, and default settings otherwise. For NN, we used the MLPRegressor method from scikit-learn with the L-BFGS solver and the logistic activation function, with default settings otherwise. For the aggregation φ, we used averaging (MEAN). We fixed α = 0.1 for a target coverage of 90%. We used n = 40 observations for training, sampling uniformly without replacement to create a training-test split for each trial. The results presented here are from 10 independent training-test splits of each data set. The ensemble wrappers J+aB and J+ENSEMBLE used sampling with replacement. We varied the size m of each bootstrap replicate as m/n = 0.2, 0.4, . . . , 1.0. For J+ENSEMBLE, we used B = 20. For the J+aB, we drew B ∼ Binomial(B̃, (1 − 1/(n+1))^m) with B̃ = [20/{(1 − 1/(n+1))^m (1 − 1/n)^m}], where [·] denotes the integer part of the argument. (A sketch of this configuration follows the table.)
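
Below is a minimal, hedged sketch of the J+aB procedure quoted in the Pseudocode row. It assumes mean aggregation (MEAN), sampling with replacement, and a decision-tree base learner; the function name and helper variables are our own scaffolding, not the paper's released code.

```python
# Sketch of Algorithms 1-2 (ensemble learning + J+aB interval), under the
# assumptions stated above. Not the authors' implementation.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def jackknife_plus_after_bootstrap(X, Y, x_test, B=20, m=None, alpha=0.1):
    """J+aB predictive interval for a single test point x_test (1-D array)."""
    n = len(Y)
    m = m if m is not None else n
    rng = np.random.default_rng(0)

    # Algorithm 1: draw B bags with replacement, fit one base model per bag.
    bags = [rng.choice(n, size=m, replace=True) for _ in range(B)]
    models = [DecisionTreeRegressor().fit(X[idx], Y[idx]) for idx in bags]
    in_bag = np.zeros((B, n), dtype=bool)
    for b, idx in enumerate(bags):
        in_bag[b, idx] = True

    # Algorithm 2: for each training point i, aggregate only the models whose
    # bag excludes i -- the out-of-bag residuals come "for free" from the ensemble.
    lo, hi = [], []
    for i in range(n):
        oob = [b for b in range(B) if not in_bag[b, i]]
        if not oob:  # point i landed in every bag (rare); skip its residual
            continue
        mu_train = np.mean([models[b].predict(X[i].reshape(1, -1))[0] for b in oob])
        mu_test = np.mean([models[b].predict(x_test.reshape(1, -1))[0] for b in oob])
        r = abs(Y[i] - mu_train)  # out-of-bag residual R_i
        lo.append(mu_test - r)
        hi.append(mu_test + r)

    # Jackknife+ quantiles: floor(alpha*(k+1))-th smallest lower endpoint,
    # ceil((1-alpha)*(k+1))-th smallest upper endpoint.
    k = len(lo)
    lower = np.sort(lo)[max(int(np.floor(alpha * (k + 1))) - 1, 0)]
    upper = np.sort(hi)[min(int(np.ceil((1 - alpha) * (k + 1))) - 1, k - 1)]
    return lower, upper
```

For example, `jackknife_plus_after_bootstrap(X_train, Y_train, X_test[0])` returns one (lower, upper) pair targeting 90% coverage at α = 0.1.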
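
And a short sketch of the configuration quoted in the Experiment Setup row: the uniform training-test split, the ridge penalty λ = 0.001‖X‖², and the Binomial draw for the J+aB ensemble size B. The pool size N, the feature matrix, and all variable names are placeholder assumptions for illustration.

```python
# Sketch of the quoted experimental configuration; constants follow the text,
# data and names are placeholders.
import numpy as np
from sklearn.linear_model import Ridge

N, n, alpha = 200, 40, 0.1   # N is a hypothetical pool size; n = 40 as quoted
m = int(0.4 * n)             # one point on the m/n grid 0.2, 0.4, ..., 1.0
rng = np.random.default_rng(0)

# Training-test split: n training indices sampled uniformly without replacement.
perm = rng.permutation(N)
train_idx, test_idx = perm[:n], perm[n:]

# RIDGE penalty: lambda = 0.001 * ||X||^2, spectral norm of the training matrix.
X_train = rng.standard_normal((n, 5))  # placeholder features
lam = 1e-3 * np.linalg.norm(X_train, ord=2) ** 2
ridge = Ridge(alpha=lam)

# J+aB ensemble size: B ~ Binomial(B_tilde, (1 - 1/(n+1))^m), with B_tilde set
# as quoted so that roughly 20 usable bags survive the leave-one-out thinning.
p = (1 - 1 / (n + 1)) ** m
B_tilde = int(20 / (p * (1 - 1 / n) ** m))  # [.] = integer part
B = rng.binomial(B_tilde, p)
```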