Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stable Prediction with Model Misspecification and Agnostic Distribution Shift
Authors: Kun Kuang, Ruoxuan Xiong, Peng Cui, Susan Athey, Bo Li (pp. 4485-4492)
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments clearly demonstrate that our DWR algorithm can significantly improve the accuracy of parameter estimation and stability of prediction with model misspecification and agnostic distribution shift. Experiments: In this section, we check the performance of our algorithm with experiments on both synthetic and real-world datasets. |
| Researcher Affiliation | Academia | 1 Zhejiang University, 2 Tsinghua University, 3 Stanford University |
| Pseudocode | Yes | Algorithm 1 Decorrelated Weighted Regression algorithm |
| Open Source Code | Yes | The online appendix and supplementary materials are available at http://kunkuang.github.io or https://www.dropbox.com/s/1q0brkc2bnehhfo/paperaaai20-Supplementary. |
| Open Datasets | Yes | We collected air pollutant data and meteorological data from the U.S. EPA's Air Quality System (AQS) database, which has been widely used for model evaluation (Yahya et al. 2017; Zhu et al. 2018). https://www.epa.gov/outdoor-air-quality-data |
| Dataset Splits | Yes | We trained all models with data from State 1, validated with data from States 1 to 4, and finally tested them on all 10 States. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments. It mentions 'a machine learning problem' but no specific GPU/CPU models or cloud instance types. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | In our experiments, we tuned these parameters with cross validation by grid searching, and each parameter is uniformly varied from {0.01, 0.1, 1, 10, 100}. Initialize parameters $W^{(0)}$ and $\beta^{(0)}$. |
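
The split and tuning procedure quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, a one-dimensional ridge regression stands in for DWR, and the names `GRID`, `grid_search`, `make_state`, `fit`, and `mse` are hypothetical. Only the grid values and the State 1 / States 1 to 4 / all 10 States split come from the quoted text. The sketch also selects on a single held-out validation set, which is a simplification of the cross validation the paper mentions.

```python
import itertools
import random

# Grid values quoted in the Experiment Setup row.
GRID = [0.01, 0.1, 1, 10, 100]


def grid_search(param_names, fit, validation_error):
    """Try every combination of GRID values; return (best_params, best_error)."""
    best_params, best_error = None, float("inf")
    for values in itertools.product(GRID, repeat=len(param_names)):
        params = dict(zip(param_names, values))
        error = validation_error(fit(**params))
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


def make_state(state, n=50):
    """Toy data for one state: y = 2x + noise, with a state-dependent input mean."""
    rng = random.Random(state)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.1 * state, 1.0)
        rows.append((x, 2.0 * x + rng.gauss(0.0, 0.5)))
    return rows


# Split as quoted: train on State 1, validate on States 1 to 4, test on all 10.
data = {s: make_state(s) for s in range(1, 11)}
train = data[1]
valid = [row for s in range(1, 5) for row in data[s]]
test = [row for s in range(1, 11) for row in data[s]]


def fit(lam):
    """Closed-form 1-D ridge slope (a stand-in model, not DWR)."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)


def mse(beta, rows):
    """Mean squared error of the prediction beta * x on the given rows."""
    return sum((y - beta * x) ** 2 for x, y in rows) / len(rows)


best, err = grid_search(["lam"], fit, lambda beta: mse(beta, valid))
print("selected:", best)
print("validation MSE:", err)
print("test MSE:", mse(fit(**best), test))
```

With several hyperparameters, passing more names to `grid_search` enumerates the full Cartesian product, so the number of fits grows as 5 to the power of the number of parameters.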