Multivariate Distributionally Robust Convex Regression under Absolute Error Loss
Authors: Jose Blanchet, Peter W. Glynn, Jun Yan, Zhengqing Zhou
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3 Numerical Experiments: In this section we investigate the performance of our estimator $\hat{f}_{n,\delta}$, and compare it with the least squares estimator (LSE) of convex regression in [15], as well as the kernel smoothing estimator. We summarize the results in the above table. It is clear that our method outperforms both LSE and kernel regression. |
| Researcher Affiliation | Academia | Jose Blanchet, Stanford MS&E (jose.blanchet@stanford.edu); Peter W. Glynn, Stanford MS&E (glynn@stanford.edu); Jun Yan, Stanford Statistics (junyan65@stanford.edu); Zhengqing Zhou, Stanford Mathematics (zqzhou@stanford.edu) |
| Pseudocode | No | The paper describes the construction of the DRCR estimator and its associated linear program formulation, but it does not present this as a structured pseudocode or algorithm block. (A hedged sketch of such a linear program is given after the table.) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We consider a public dataset from the United States Environmental Protection Agency, which was suggested by [17]. |
| Dataset Splits | No | The paper states 'we randomly split the dataset into a training set with 400 data points and a test set with 200 data points' for the real dataset, but does not give details of any validation split. For the synthetic datasets, it describes generating i.i.d. samples but no train/test/validation split. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU, GPU models, or cloud computing instance types) used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, specific solvers) used in the experiments. |
| Experiment Setup | Yes | We construct our DRCR estimator $\hat{f}_{n,\delta_n}$ by taking $\delta_n = n^{-2/d}$. For the LSE of convex regression, in line with the setting in [3, 15], let $c$ be any numerical constant greater than $\|f\|_\infty$, and we consider the class of functions $F_c := \{f : f \text{ is convex}, \|f\|_\infty \le c\}$. Given that $\|f\|_\infty = 1$, we set $c = 10$ or $0.8$... For some bandwidth $h_n > 0$, we define the kernel regression estimator $\hat{k}_{n,h_n}$ by $\hat{k}_{n,h_n}(x) = \sum_{i=1}^n Y_i K((x - X_i)/h_n) \big/ \sum_{i=1}^n K((x - X_i)/h_n)$, where $K : \mathbb{R}^d \to \mathbb{R}$ denotes the Gaussian kernel with $K(x) = (2\pi)^{-d/2} e^{-\|x\|^2/2}$. We then choose the best bandwidth $h_n$ via cross validation. To be specific, we pick $h_n = C n^{-1/(d+4)}$ and then optimize the choice of $C$ via line search... In the experiments, we set $d = 5$, $n \in \{50, 100, 150, 200, 250, 300, 350\}$ and $\sigma = 0.2$. |
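The kernel regression baseline quoted in the Experiment Setup row is simple to reconstruct. Below is a minimal sketch of that estimator, assuming the Gaussian kernel and the bandwidth family $h_n = C n^{-1/(d+4)}$ from the quote; where the paper tunes $C$ by cross validation, this sketch scores candidates on a single held-out split, and the grid of candidate $C$ values is illustrative rather than taken from the paper.

```python
import numpy as np

def gaussian_kernel(u):
    """K(u) = (2*pi)^(-d/2) * exp(-||u||^2 / 2), applied to each row of u."""
    d = u.shape[-1]
    return (2 * np.pi) ** (-d / 2) * np.exp(-np.sum(u ** 2, axis=-1) / 2)

def kernel_regression(x, X, Y, h):
    """Nadaraya-Watson estimate at x: sum_i Y_i K((x - X_i)/h) / sum_i K((x - X_i)/h)."""
    w = gaussian_kernel((x - X) / h)
    return np.sum(w * Y) / np.sum(w)

def pick_bandwidth_constant(X, Y, X_val, Y_val, C_grid=np.linspace(0.1, 3.0, 30)):
    """Line search over C in h_n = C * n^(-1/(d+4)), scored by held-out
    absolute error (the C grid is an illustrative assumption)."""
    n, d = X.shape
    best_C, best_err = None, np.inf
    for C in C_grid:
        h = C * n ** (-1.0 / (d + 4))
        preds = np.array([kernel_regression(x, X, Y, h) for x in X_val])
        err = np.mean(np.abs(preds - Y_val))
        if err < best_err:
            best_C, best_err = C, err
    return best_C
```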
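The Pseudocode row notes that the paper states its estimator as a linear program without an algorithm block. As a hedged illustration of the generic ingredients (not the authors' DRCR program), the sketch below fits convex regression under absolute error loss in its standard finite-dimensional form: the decision variables are fitted values and subgradients tied together by first-order convexity constraints. The function name `fit_convex_lad` is hypothetical, `cvxpy` is an assumed dependency, and the paper's distributionally robust perturbation of radius $\delta_n$ is deliberately omitted.

```python
# Minimal sketch (not the authors' implementation): convex regression under
# absolute error loss as a finite-dimensional convex program. The paper's
# distributionally robust modification (radius delta_n) is NOT included.
import cvxpy as cp
import numpy as np

def fit_convex_lad(X, Y):
    """Least-absolute-deviations convex regression on data (X, Y)."""
    n, d = X.shape
    g = cp.Variable(n)        # fitted values g_i = f(X_i)
    xi = cp.Variable((n, d))  # subgradients of f at each X_i
    # First-order convexity constraints: f(X_j) >= f(X_i) + <xi_i, X_j - X_i>.
    constraints = [
        g[j] >= g[i] + xi[i] @ (X[j] - X[i])
        for i in range(n) for j in range(n) if i != j
    ]
    problem = cp.Problem(cp.Minimize(cp.sum(cp.abs(Y - g)) / n), constraints)
    problem.solve()
    return g.value, xi.value

def predict(x, g, xi, X):
    """Evaluate the fitted convex function as the max of its supporting hyperplanes."""
    return np.max(g + xi @ x - np.sum(xi * X, axis=1))
```

Because the absolute values linearize via auxiliary variables inside the solver, the whole problem is a linear program; the $O(n^2)$ convexity constraints are what dominate the cost as $n$ grows.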