Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods
Authors: Shang Liu, Zhongze Cai, Xiaocheng Li
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show the advantage of such a simple approach under various metrics, and also under covariate shift. We hope our work provides a simple benchmark and a starting point of theoretical ground for future research on regression calibration. |
| Researcher Affiliation | Academia | Shang Liu, Imperial College Business School, Imperial College London, s.liu21@imperial.ac.uk; Zhongze Cai, Imperial College Business School, Imperial College London, z.cai22@imperial.ac.uk; Xiaocheng Li, Imperial College Business School, Imperial College London, xiaocheng.li@imperial.ac.uk |
| Pseudocode | Yes | Algorithm 1 Simple Nonparametric Quantile Estimator; Algorithm 2 Nonparametric Regression Calibration |
| Open Source Code | Yes | All codes are available at https://github.com/ZhongzeCai/NRC. |
| Open Datasets | Yes | We evaluate our NRC algorithms on the standard 8 UCI datasets Dua and Graff [2017]. The Bike-Sharing dataset provided in Fanaee-T and Gama [2014] is used to visually evaluate our proposed algorithm. We examine them against several high-dimensional datasets: the medical expenditure panel survey (MEPS) datasets [panel 19, 2017, panel 20, 2017, panel 21, 2017], as suggested in [Romano et al., 2019]. |
| Dataset Splits | Yes | We generate ntrain = 40000 samples for training and ntest = 1000 for testing. For our NRC method, we randomly select n1 = 36000 for training a regression model and ntrain − n1 = 4000 samples for the quantile calibration step. Each time we randomly split out 10% of the whole sample as the testing set. On the remaining set, for our NRC algorithms we additionally separate out 30% for recalibration, and for the rest of the algorithms, no additional partitioning is required. The rest of the data is then fed into the whole train-validation loop, and we set up an early stopping scheme with a patience count of 20. We split the starting 80% of time stamps for training (and further split for recalibration, if required), and the ending 20% of time stamps for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments, only general mentions of neural networks. |
| Software Dependencies | No | The paper mentions deploying 'LSTM networks' and 'neural network' models but does not provide specific version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | All neural networks used in this example have hidden layers of size [100, 50]. For all the network structures used in Section 4 we fix the size of the hidden layers to [20, 20] (and [10] for the DGP model). For our dimension reduction algorithms, the target dimension that we reduce to is set to 4. The rest of the hyperparameters (including the learning rate, minibatch size, kernel width, etc.) are pre-determined by running a grid search. The learning rate is searched within [10^-2, 5×10^-3, 10^-3], the minibatch size is searched within [10, 64, 128], and the rest of the hyperparameters exclusive to each model are searched in the same fashion. |
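
As a rough illustration of the partitioning and grid search reported in the *Dataset Splits* and *Experiment Setup* rows above, the sketch below recreates the 90/10 test split, the 30% recalibration hold-out, and the reported search ranges for the shared hyperparameters. The function names (`make_splits`, `grid_configs`), the `HYPERPARAM_GRID` dictionary, and the use of scikit-learn's `train_test_split` are illustrative assumptions, not code taken from the paper or its repository.

```python
# Minimal sketch, assuming scikit-learn is available; names are hypothetical.
import itertools
from sklearn.model_selection import train_test_split

def make_splits(X, y, seed=0):
    """Hold out 10% for testing; of the remainder, hold out 30% for recalibration."""
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.10, random_state=seed)
    X_train, X_recal, y_train, y_recal = train_test_split(
        X_rest, y_rest, test_size=0.30, random_state=seed)
    return (X_train, y_train), (X_recal, y_recal), (X_test, y_test)

# Search ranges reported for the shared hyperparameters; model-specific
# hyperparameters are searched in the same fashion.
HYPERPARAM_GRID = {
    "learning_rate": [1e-2, 5e-3, 1e-3],
    "batch_size": [10, 64, 128],
}

def grid_configs(grid):
    """Yield every combination of hyperparameter values in the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))
```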