Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods

Authors: Shang Liu, Zhongze Cai, Xiaocheng Li

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments show the advantage of such a simple approach under various metrics, and also under covariate shift. We hope our work provides a simple benchmark and a theoretical starting point for future research on regression calibration.
Researcher Affiliation | Academia | Shang Liu, Imperial College Business School, Imperial College London, s.liu21@imperial.ac.uk; Zhongze Cai, Imperial College Business School, Imperial College London, z.cai22@imperial.ac.uk; Xiaocheng Li, Imperial College Business School, Imperial College London, xiaocheng.li@imperial.ac.uk
Pseudocode | Yes | Algorithm 1: Simple Nonparametric Quantile Estimator; Algorithm 2: Nonparametric Regression Calibration. (An illustrative nonparametric quantile sketch appears after the table.)
Open Source Code | Yes | All code is available at https://github.com/ZhongzeCai/NRC.
Open Datasets | Yes | We evaluate our NRC algorithms on the 8 standard UCI datasets [Dua and Graff, 2017]. The Bike-Sharing dataset provided in Fanaee-T and Gama [2014] is used to visually evaluate our proposed algorithm. We also evaluate on several high-dimensional datasets: the Medical Expenditure Panel Survey (MEPS) datasets [panel 19, 2017; panel 20, 2017; panel 21, 2017], as suggested in [Romano et al., 2019].
Dataset Splits | Yes | We generate ntrain = 40000 samples for training and ntest = 1000 for testing. For our NRC method, we randomly select n1 = 36000 for training a regression model and ntrain − n1 = 4000 samples for the quantile calibration step. Each time we randomly split out 10% of the whole sample as the testing set. On the remaining set, for our NRC algorithms we additionally separate out 30% for recalibration; for the remaining algorithms, no additional partitioning is required. The remaining data is then fed into the whole train-validation loop, and we set up an early stopping scheme with a patience count of 20. We split the starting 80% of time stamps for training (and further split for recalibration, if required), and the ending 20% of time stamps for testing. (A split sketch appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments, only general mentions of neural networks.
Software Dependencies | No | The paper mentions deploying 'LSTM networks' and 'neural network' models but does not provide specific version numbers for any software dependencies or libraries used.
Experiment Setup | Yes | All neural networks used in this example have hidden layers of size [100, 50]. For all the network structures used in Section 4 we fix the size of the hidden layers to [20, 20] (and [10] for the DGP model). For our dimension reduction algorithms, the target dimension that we reduce to is set to 4. The remaining hyperparameters (including the learning rate, minibatch size, kernel width, etc.) are pre-determined by running a grid search. The learning rate is searched within [10^-2, 5×10^-3, 10^-3], the minibatch size is searched within [10, 64, 128], and the remaining hyperparameters exclusive to each model are searched in the same fashion. (A grid-search sketch appears after the table.)
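
Illustrative sketch for the Pseudocode row: the paper's Algorithm 1 (Simple Nonparametric Quantile Estimator) is not reproduced here. As a rough illustration only, a k-nearest-neighbour conditional quantile estimate of the kind such nonparametric calibrators build on might look as follows; the function name knn_quantile, the choice of k, and the use of plain Euclidean distances are our assumptions, not details taken from the paper or the NRC repository.

```python
import numpy as np

def knn_quantile(x_calib, y_calib, x_query, q=0.9, k=50):
    """Illustrative kNN-style nonparametric quantile estimate (not the paper's exact Algorithm 1).

    x_calib: (n, d) calibration covariates
    y_calib: (n,)   calibration targets (or residuals of a fitted regressor)
    x_query: (d,)   point at which to estimate the conditional q-quantile
    """
    # Distances from the query point to every calibration point.
    dists = np.linalg.norm(x_calib - x_query, axis=1)
    # Take the k nearest calibration targets ...
    nearest = np.argsort(dists)[:k]
    # ... and return their empirical q-quantile as the local estimate.
    return np.quantile(y_calib[nearest], q)

# Toy usage: conditional 90% quantile of y given x, queried at x = 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=(4000, 1))
y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(scale=0.3, size=4000)
print(knn_quantile(x, y, np.array([0.5]), q=0.9, k=50))
```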
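
The split described in the Dataset Splits row (10% of each dataset held out for testing, then 30% of the remainder reserved for recalibration in the NRC methods) could be reproduced with a snippet along these lines; the use of scikit-learn's train_test_split, the synthetic placeholder data, and the fixed random seed are our assumptions, not details from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for one of the UCI datasets; shapes are arbitrary.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = rng.normal(size=1000)

# 10% of the whole sample is held out as the test set.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.10, random_state=0)

# For the NRC algorithms, a further 30% of the remaining data is reserved for the
# quantile recalibration step; the rest enters the train/validation loop
# (early stopping with a patience count of 20). Baselines skip this extra split.
X_train, X_recal, y_train, y_recal = train_test_split(
    X_rest, y_rest, test_size=0.30, random_state=0
)
```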
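
The grid search described in the Experiment Setup row (learning rate in [10^-2, 5×10^-3, 10^-3], minibatch size in [10, 64, 128]) amounts to an exhaustive sweep such as the one sketched below; validation_loss is a hypothetical placeholder for the actual train/validate routine, which the paper does not spell out.

```python
from itertools import product

# Grids reported in the paper for the shared hyperparameters.
learning_rates = [1e-2, 5e-3, 1e-3]
batch_sizes = [10, 64, 128]

def validation_loss(lr, batch_size):
    """Placeholder: train with (lr, batch_size) and return a validation metric."""
    # In the real pipeline this would fit the regression / calibration model
    # (with early stopping, patience 20) and return its validation loss.
    return (lr - 5e-3) ** 2 + (batch_size - 64) ** 2  # dummy objective

# Exhaustive grid search; model-specific hyperparameters (e.g., kernel width)
# would simply add extra factors to the product below.
best = min(product(learning_rates, batch_sizes), key=lambda cfg: validation_loss(*cfg))
print("best (learning rate, batch size):", best)
```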