Multivariate Distributionally Robust Convex Regression under Absolute Error Loss

Authors: Jose Blanchet, Peter W. Glynn, Jun Yan, Zhengqing Zhou

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "3 Numerical Experiments: In this section we investigate the performance of our estimator f̂_{n,δ}, and compare it with the least squares estimator (LSE) of convex regression in [15], as well as the kernel smoothing estimator. We summarize the results in the above table. It is clear that our method outperforms both the LSE and the kernel estimator." |
| Researcher Affiliation | Academia | Jose Blanchet (Stanford MS&E, jose.blanchet@stanford.edu); Peter W. Glynn (Stanford MS&E, glynn@stanford.edu); Jun Yan (Stanford Statistics, junyan65@stanford.edu); Zhengqing Zhou (Stanford Mathematics, zqzhou@stanford.edu) |
| Pseudocode | No | The paper describes the construction of the DRCR estimator and its associated linear program formulation, but it does not present either as a structured pseudocode or algorithm block (a hedged sketch of the nominal LP appears below the table). |
| Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor a link to a code repository. |
| Open Datasets | Yes | "We consider a public dataset from United States Environmental Protection Agency, which was suggested by [17]." |
| Dataset Splits | No | For the real dataset, the paper states "we randomly split the dataset into a training set with 400 data and a test set with 200 data", but it gives no details of a validation split. For the synthetic datasets, it describes generating i.i.d. samples but no train/validation/test split. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU or GPU models, or cloud computing instance types) used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific solvers) used in the experiments. |
| Experiment Setup | Yes | "We construct our DRCR estimator f̂_{n,δ_n} by taking δ_n = n^(-2/d). For the LSE of convex regression, in line with the setting in [3, 15], let c be any numerical constant greater than ‖f‖_∞, and we consider the class of functions F_c := {f : f is convex, ‖f‖_∞ ≤ c}. Given that ‖f‖_∞ = 1, we set c = 10 or 0.8... For some bandwidth h_n > 0, we define the kernel regression estimator k̂_{n,h_n} by k̂_{n,h_n}(x) = Σ_{i=1}^n Y_i K((x - X_i)/h_n) / Σ_{i=1}^n K((x - X_i)/h_n), where K : R^d → R denotes the Gaussian kernel with K(x) = (2π)^(-d/2) e^(-‖x‖²/2). We then choose the best bandwidth h_n via cross-validation. To be specific, we pick h_n = C·n^(-1/(d+4)), and then optimize the choice of C via line search... In the experiments, we set d = 5, n ∈ {50, 100, 150, 200, 250, 300, 350}, and σ = 0.2." |
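
Since the paper formulates its estimator through a linear program but never presents it as pseudocode, the following is a minimal sketch of the nominal (non-robust) convex regression problem under absolute error loss as an LP, using numpy and scipy. This is an illustration under standard assumptions, not the paper's DRCR program: the DRCR estimator additionally involves a Wasserstein ambiguity set of radius δ_n, which is not reproduced here, and all names below are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def convex_regression_l1(X, Y):
    """Sketch: fit values g_i and subgradients xi_i of a convex function by
    minimizing sum_i |Y_i - g_i| subject to the subgradient (convexity)
    constraints g_j >= g_i + xi_i . (X_j - X_i) for all i, j.

    Variable layout: z = [g_1..g_n, t_1..t_n, xi_1..xi_n (flattened)],
    where t_i is an auxiliary bound on the absolute residual |Y_i - g_i|."""
    n, d = X.shape
    n_vars = 2 * n + n * d

    # Objective: minimize sum_i t_i.
    c = np.zeros(n_vars)
    c[n:2 * n] = 1.0

    rows, rhs = [], []
    # Encode |Y_i - g_i| <= t_i as two linear inequalities.
    for i in range(n):
        r = np.zeros(n_vars); r[i] = -1.0; r[n + i] = -1.0
        rows.append(r); rhs.append(-Y[i])      # Y_i - g_i <= t_i
        r = np.zeros(n_vars); r[i] = 1.0; r[n + i] = -1.0
        rows.append(r); rhs.append(Y[i])       # g_i - Y_i <= t_i
    # Convexity: g_i - g_j + xi_i . (X_j - X_i) <= 0 for all i != j.
    for i in range(n):
        for j in range(n):
            if i != j:
                r = np.zeros(n_vars)
                r[i], r[j] = 1.0, -1.0
                r[2 * n + i * d: 2 * n + (i + 1) * d] = X[j] - X[i]
                rows.append(r); rhs.append(0.0)

    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=[(None, None)] * n_vars, method="highs")
    g = res.x[:n]
    Xi = res.x[2 * n:].reshape(n, d)
    return g, Xi

# Toy usage: noisy samples of a convex function, then a max-affine prediction.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
Y = np.sum(X ** 2, axis=1) + 0.2 * rng.standard_normal(30)
g, Xi = convex_regression_l1(X, Y)
x0 = np.zeros(2)
f_hat = np.max(g + ((x0 - X) * Xi).sum(axis=1))  # max-affine extension at x0
```

The O(n²) convexity constraints are what make full convex regression expensive at scale; an LP solver that exploits sparsity (e.g., HiGHS above) keeps the toy sizes here comfortable.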
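The kernel smoothing baseline in the setup row also lends itself to a short sketch: a Nadaraya-Watson estimator with a Gaussian kernel, the bandwidth rule h_n = C·n^(-1/(d+4)), and a line search over C scored on held-out absolute error. The 80/20 holdout split, the grid for C, and the toy data-generating function are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def nw_predict(X_train, Y_train, X_query, h):
    """Nadaraya-Watson estimate with the Gaussian kernel; the (2*pi)^(-d/2)
    normalizing factor cancels in the weight ratio, so it is omitted."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * h ** 2))
    return (W @ Y_train) / W.sum(axis=1)

def tune_bandwidth_constant(X, Y, C_grid):
    """Pick C in h = C * m^(-1/(d+4)) by line search on a holdout split,
    scoring by mean absolute error (the loss studied in the paper)."""
    n, d = X.shape
    m = int(0.8 * n)  # illustrative 80/20 holdout, not from the paper
    Xtr, Ytr, Xva, Yva = X[:m], Y[:m], X[m:], Y[m:]
    best_C, best_err = C_grid[0], np.inf
    for C in C_grid:
        h = C * m ** (-1.0 / (d + 4))
        err = np.abs(nw_predict(Xtr, Ytr, Xva, h) - Yva).mean()
        if err < best_err:
            best_C, best_err = C, err
    return best_C

# Toy usage mirroring the reported dimensions (d = 5, sigma = 0.2).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.uniform(-1.0, 1.0, size=(n, d))
Y = np.sum(X ** 2, axis=1) + 0.2 * rng.standard_normal(n)
C = tune_bandwidth_constant(X, Y, C_grid=np.linspace(0.1, 3.0, 30))
h = C * n ** (-1.0 / (d + 4))
Y_hat = nw_predict(X, Y, X, h)
```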