Fast Nonlinear Vector Quantile Regression

Authors: Aviv A. Rosenberg, Sanketh Vedula, Yaniv Romano, Alexander Bronstein

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We use four synthetic and four real datasets which are detailed in Appendices D.1 and D.2. Except for the MVN dataset, which was used for the scale and optimization experiments, the remaining three synthetic datasets were carefully selected to be challenging since they exhibit complex nonlinear relationships between X and Y (see e.g. fig. 1b). We evaluate using the following metrics (detailed in Appendix E): (i) KDE-L1, an estimate of distance between distributions; (ii) QFD, a distance measured between an estimated CVQF and its ground truth; (iii) Inverse CVQF entropy; (iv) Monotonicity violations; (v) Marginal coverage; (vi) Size of α-confidence set.
Researcher Affiliation | Collaboration | Aviv A. Rosenberg1,3, Sanketh Vedula1,3, Yaniv Romano1,2, and Alex M. Bronstein1,3; 1Department of Computer Science, Technion; 2Department of Electrical and Computer Engineering, Technion; 3Sibylla, UK
Pseudocode | No | The paper describes methods and procedures in text but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | We release a feature-rich, well-tested Python package vqr, implementing estimation of vector quantiles, vector ranks, vector quantile contours, linear and nonlinear VQR, and VMR. To the best of our knowledge, this would be the first publicly available tool for estimating conditional vector quantile functions at scale. (Footnote 1: vqr can be installed with pip install vqr; source available at https://github.com/vistalab-technion/vqr)
Open Datasets | Yes | We use four synthetic and four real datasets which are detailed in Appendices D.1 and D.2. The original real datasets contained one-dimensional targets. Feldman et al. (2021) constructed an additional target variable by selecting a feature that has high correlation with the first target variable and low correlation with the other input features, so that it is hard to predict. A summary of these datasets is presented in table A1.
Dataset Splits | Yes | In all the real data experiments, we randomly split the data into 80% training set and 20% hold-out test set.
Hardware Specification | Yes | All experiments were run on a machine with an Intel Xeon E5 CPU, 256GB of RAM and an Nvidia Titan 2080Ti GPU with 11GB dedicated graphics memory.
Software Dependencies | No | The paper mentions using Python packages like 'vqr', 'scipy' with 'qhull', 'pykeops', and the 'POT' library, but it does not specify exact version numbers for these software dependencies.
Experiment Setup | Yes | Synthetic glasses experiment: We set N = 10k, T = 100, and ε = 0.001. We optimized both VQR and NL-VQR for 40k iterations and used a learning rate scheduler that decays the learning rate by a factor of 0.9 every 500 iterations if the error does not drop by 0.5%. Conditional Banana and Rotating Star experiments: We set ε = 0.005 and optimized both VQR and NL-VQR for 20k iterations. We used the same learning rate and schedulers as in the synthetic glasses experiment. Real data experiments: All methods were run for 40k iterations, with the learning rate set to 0.3 and ε = 0.01. We set T = 50 for NL-VQR and VQR baselines, and T = 100 for separable linear and nonlinear QR baselines.
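The "Monotonicity violations" metric listed in the evaluation row above can be illustrated with a toy check. A vector quantile function should be co-monotonic, i.e. (Q(u) − Q(u′)) · (u − u′) ≥ 0 for every pair of quantile levels u, u′; counting sampled pairs that break this inequality yields a violation rate. This is a minimal plain-Python sketch of that idea, not the paper's implementation (Appendix E defines the exact metric):

```python
from itertools import combinations

def monotonicity_violation_rate(us, qs):
    """Fraction of level pairs (u, u') violating the co-monotonicity
    condition (Q(u) - Q(u')) . (u - u') >= 0, given sampled levels
    `us` and the corresponding quantile values `qs` (both lists of tuples)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    pairs = list(combinations(range(len(us)), 2))
    violations = sum(
        1 for i, j in pairs
        if dot([q - p for q, p in zip(qs[i], qs[j])],
               [u - v for u, v in zip(us[i], us[j])]) < 0
    )
    return violations / len(pairs)

# The identity map Q(u) = u is trivially co-monotonic.
levels = [(0.1, 0.1), (0.5, 0.2), (0.9, 0.8)]
print(monotonicity_violation_rate(levels, levels))  # -> 0.0
```

A sign-flipped map Q(u) = −u reverses every pairwise dot product and gives a violation rate of 1.0, which is the failure mode this metric is designed to catch.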
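The 80%/20% random split quoted in the Dataset Splits row can be reproduced with a standard shuffled index split. The paper does not publish its splitting code or seed, so the seed below is purely illustrative:

```python
import random

def train_test_split(data, test_frac=0.2, seed=0):
    """Randomly partition `data` into train/test subsets (80/20 by default)."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)          # deterministic shuffle for a fixed seed
    n_test = int(len(data) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return [data[i] for i in train_idx], [data[i] for i in test_idx]

train, test = train_test_split(list(range(1000)))
print(len(train), len(test))  # -> 800 200
```

In practice one would shuffle (X, Y) pairs jointly; the single-list version above keeps the sketch short.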
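The learning-rate schedule in the Experiment Setup row (decay by a factor of 0.9 every 500 iterations unless the error improved by at least 0.5%) behaves like a relative-threshold plateau scheduler. This is a self-contained sketch of that rule under those stated hyperparameters, not the authors' code:

```python
class PlateauDecay:
    """Decay lr by `factor` every `interval` steps unless the tracked
    error dropped by at least `rel_threshold` relative to the best seen."""

    def __init__(self, lr=0.3, factor=0.9, interval=500, rel_threshold=0.005):
        self.lr, self.factor = lr, factor
        self.interval, self.rel_threshold = interval, rel_threshold
        self.best = float("inf")
        self.step_count = 0

    def step(self, error):
        self.step_count += 1
        if error < self.best * (1 - self.rel_threshold):
            self.best = error                 # sufficient relative improvement
        elif self.step_count % self.interval == 0:
            self.lr *= self.factor            # plateau: decay the learning rate

sched = PlateauDecay()
for _ in range(1000):                         # constant error: two decay events
    sched.step(1.0)
print(round(sched.lr, 4))  # -> 0.243 (0.3 * 0.9 * 0.9)
```

The same behavior is available off the shelf as torch.optim.lr_scheduler.ReduceLROnPlateau with threshold_mode='rel'; the paper does not say which implementation was used.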