Fast Nonlinear Vector Quantile Regression
Authors: Aviv A. Rosenberg, Sanketh Vedula, Yaniv Romano, Alexander Bronstein
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We use four synthetic and four real datasets which are detailed in Appendices D.1 and D.2. Except for the MVN dataset, which was used for the scale and optimization experiments, the remaining three synthetic datasets were carefully selected to be challenging since they exhibit complex nonlinear relationships between X and Y (see e.g. fig. 1b). We evaluate using the following metrics (detailed in Appendix E): (i) KDE-L1, an estimate of distance between distributions; (ii) QFD, a distance measured between an estimated CVQF and its ground truth; (iii) Inverse CVQF entropy; (iv) Monotonicity violations; (v) Marginal coverage; (vi) Size of α-confidence set. |
| Researcher Affiliation | Collaboration | Aviv A. Rosenberg (1,3), Sanketh Vedula (1,3), Yaniv Romano (1,2), and Alex M. Bronstein (1,3). (1) Department of Computer Science, Technion; (2) Department of Electrical and Computer Engineering, Technion; (3) Sibylla, UK |
| Pseudocode | No | The paper describes methods and procedures in text but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release a feature-rich, well-tested python package vqr (installable with pip install vqr; source available at https://github.com/vistalab-technion/vqr), implementing estimation of vector quantiles, vector ranks, vector quantile contours, linear and nonlinear VQR, and VMR. To the best of our knowledge, this would be the first publicly available tool for estimating conditional vector quantile functions at scale. |
| Open Datasets | Yes | We use four synthetic and four real datasets which are detailed in Appendices D.1 and D.2. The original real datasets contained one-dimensional targets. Feldman et al. (2021) constructed an additional target variable by selecting a feature that has high correlation with the first target variable and small correlation with the other input features, so that it is hard to predict. A summary of these datasets is presented in Table A1. |
| Dataset Splits | Yes | In all the real data experiments, we randomly split the data into 80% training set and 20% hold-out test set. |
| Hardware Specification | Yes | All experiments were run on a machine with an Intel Xeon E5 CPU, 256GB of RAM and an Nvidia Titan 2080Ti GPU with 11GB dedicated graphics memory. |
| Software Dependencies | No | The paper mentions using Python packages like 'vqr', 'scipy' with 'qhull', 'pykeops', and 'POT library', but it does not specify exact version numbers for these software dependencies. |
| Experiment Setup | Yes | Synthetic glasses experiment: We set N = 10k, T = 100, and ε = 0.001. We optimized both VQR and NL-VQR for 40k iterations and used a learning rate scheduler that decays the learning rate by a factor of 0.9 every 500 iterations if the error does not drop by 0.5%. Conditional Banana and Rotating Star experiments: We set ε = 0.005 and optimized both VQR and NL-VQR for 20k iterations. We used the same learning rate and schedulers as in the synthetic glasses experiment. Real data experiments: All methods were run for 40k iterations, with learning rate set to 0.3 and ε = 0.01. We set T = 50 for NL-VQR and VQR baselines, and T = 100 for separable linear and nonlinear QR baselines. |
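The learning-rate schedule quoted in the Experiment Setup row (decay by a factor of 0.9 every 500 iterations unless the error has dropped by at least 0.5%) can be sketched in plain Python. This is an illustrative reconstruction, not the paper's actual implementation; the class name, attribute names, and bookkeeping are assumptions:

```python
class DecayOnPlateau:
    """Illustrative scheduler: every `patience` iterations, decay the
    learning rate by `factor` unless the error improved by at least
    `min_rel_improvement` relative to the best error seen so far."""

    def __init__(self, lr=0.3, factor=0.9, patience=500, min_rel_improvement=0.005):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_rel_improvement = min_rel_improvement
        self._best = None   # best error observed so far
        self._iters = 0

    def step(self, error):
        """Call once per optimization iteration with the current error."""
        self._iters += 1
        if self._iters % self.patience != 0:
            return self.lr
        # At a scheduled check: decay if the error did not improve enough.
        if self._best is not None and error > self._best * (1 - self.min_rel_improvement):
            self.lr *= self.factor
        if self._best is None or error < self._best:
            self._best = error
        return self.lr


# Example: a stagnating error triggers a decay at the second check.
sched = DecayOnPlateau(lr=0.3)
for _ in range(1000):
    lr = sched.step(error=1.0)  # error never improves
print(round(lr, 4))  # first check only records the baseline; second decays: 0.27
```

With the paper's settings (40k iterations, checks every 500), such a schedule allows up to ~79 decays after the initial baseline check, so the hyperparameters above match the quoted description but the stopping behavior is only a sketch.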