Robust Learning under Uncertain Test Distributions: Relating Covariate Shift to Model Misspecification
Authors: Junfeng Wen, Chun-Nam Yu, Russell Greiner
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical studies, on UCI datasets and a real-world cancer prognostic prediction dataset, show that our analysis applies, and that our RCSA works effectively. |
| Researcher Affiliation | Collaboration | (1) Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada; (2) Bell Labs, Alcatel-Lucent, 600 Mountain Avenue, Murray Hill, NJ 07974, USA |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, such as a specific repository link, an explicit code release statement, or code in supplementary materials. |
| Open Datasets | Yes | We obtain some classification datasets from UCI repository. All are binary classification problems. For regression task, we use Auto-mpg dataset, which admits a natural covariate shift scenario, as it contains data collected from 3 different cities. We also have a set of cancer patient survival time data provided by our medical collaborators, containing 1523 uncensored patients with 40 features, including gender, stage of cancer, and various measurements obtained at the time of diagnosis. Table 1 shows the summary of the datasets we used in the experiments. |
| Dataset Splits | Yes | To show how to detect whether a dominant strategy exists with various adversarial sets A, we generate 500 data points uniformly in the interval [-1.5, 2], which we partition into training and test sets via 10-fold cross validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | We tune the parameter λ via 10-fold cross validation. To construct reasonable adversaries, Gaussian kernel is applied to Eq. (4), setting σ to be the average distance from an instance to its n/5-nearest neighbour, the bases b_j to be the training points and B to be 5. |
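
The bandwidth heuristic quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code (the paper releases none). The function names `knn_bandwidth` and `gaussian_features`, the rounding of n/5 down to an integer, and the kernel form exp(-||x - b||² / (2σ²)) are all assumptions made for illustration only.

```python
import numpy as np


def _as_2d(X):
    """Return X as a float array of shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    return X[:, None] if X.ndim == 1 else X


def knn_bandwidth(X, frac=5):
    """Average distance from each point to its (n // frac)-th nearest neighbour.

    Assumption: n / frac is rounded down and clipped to the range [1, n - 1].
    """
    X = _as_2d(X)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two points")
    k = min(max(1, n // frac), n - 1)
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbour.
    kth = np.sort(dist, axis=1)[:, k]
    return float(kth.mean())


def gaussian_features(X, bases, sigma):
    """Gaussian kernel values between rows of X and rows of bases.

    Assumed kernel form: exp(-||x - b||^2 / (2 * sigma^2)).
    """
    X, bases = _as_2d(X), _as_2d(bases)
    sq = ((X[:, None, :] - bases[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.uniform(0.0, 1.0, size=(50, 2))  # hypothetical training set
    sigma = knn_bandwidth(X_train)
    # The training points themselves serve as the bases b_j.
    K = gaussian_features(X_train, X_train, sigma)
    print(sigma, K.shape)
```

Taking the training points as the bases gives one kernel column per training instance. Tying σ to a neighbour distance keeps the kernel width on the scale of the data without a separate tuning step.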