Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications
Authors: Jiashuo Liu, Jiayun Wu, Tianyu Wang, Hao Zou, Bo Li, Peng Cui
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments confirm GCDRO's superiority over conventional DRO methods. In this section, we test the empirical performances of our proposed GCDRO on simulation data and real-world regression datasets with natural distribution shifts. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science and Technology, Tsinghua University; (2) Department of Industrial Engineering and Operations Research, Columbia University; (3) Zhongguancun Lab; (4) School of Economics and Management, Tsinghua University. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement or link indicating the release of its own source code for the described methodology. |
| Open Datasets | Yes | Datasets. (1) Bike-sharing dataset (Dua & Graff, 2017)... (2) House Price dataset [1]... [1] https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data (3) Temperature dataset (Dua & Graff, 2017)... URL http://archive.ics.uci.edu/ml. |
| Dataset Splits | Yes | In training, we generate 9,500 points with r = 1.9 (majority, strong positive spurious correlation V-Y) and 500 points with r = 1.3 (minority, weak negative spurious correlation V-Y). In practice, we do a grid search over α ∈ [0.1, 10] on an independent held-out validation dataset to select the best α. |
| Hardware Specification | No | The paper mentions 'GPU' generally for parallelization but does not specify any particular GPU model, CPU, or other hardware components used for experiments. |
| Software Dependencies | No | The paper mentions software like 'PyTorch' and 'DGL package (Wang et al., 2019)' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For all these experiments, we use a two-layer MLP model with mean square error (MSE). We use the Adam optimizer (Kingma & Ba, 2015) with the default learning rate 1e-3. And all methods are trained for 5e3 epochs. The hyper-parameter search space is specified in Appendix. |
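
The Dataset Splits and Experiment Setup rows quote a grid search over α ∈ [0.1, 10] on a held-out validation set, a 9,500/500 majority/minority split, and a two-layer MLP trained with MSE and Adam. The sketch below shows one way such a selection loop could be organized; it is not the authors' code. The log-spaced grid, the number of candidates, the helper names `alpha_grid` and `select_alpha`, and the stand-in `toy_validation_loss` are all assumptions made for illustration, since the quoted text gives only the range of α and says the full search space is in the paper's appendix.

```python
import math

# Values quoted in the table above.
N_MAJORITY, N_MINORITY = 9_500, 500
ALPHA_MIN, ALPHA_MAX = 0.1, 10.0
SETUP = {
    "model": "two-layer MLP",
    "loss": "MSE",
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "epochs": 5_000,
}


def alpha_grid(num_points=9):
    """Log-spaced candidates covering [ALPHA_MIN, ALPHA_MAX].

    Assumption: the quoted text does not state the grid spacing or density.
    """
    lo, hi = math.log10(ALPHA_MIN), math.log10(ALPHA_MAX)
    step = (hi - lo) / (num_points - 1)
    return [10 ** (lo + i * step) for i in range(num_points)]


def select_alpha(validation_loss, candidates):
    """Return the candidate with the lowest held-out validation loss."""
    return min(candidates, key=validation_loss)


def toy_validation_loss(alpha):
    # Hypothetical stand-in: a real run would train a model with this alpha
    # under SETUP and evaluate it on the held-out validation set.
    return (math.log10(alpha) - 0.5) ** 2


minority_fraction = N_MINORITY / (N_MAJORITY + N_MINORITY)
best_alpha = select_alpha(toy_validation_loss, alpha_grid())
print(f"minority fraction: {minority_fraction:.2f}")
print(f"selected alpha: {best_alpha:.3f}")
```

In a real reproduction, `toy_validation_loss` would be replaced by a function that trains with the given α and returns the validation error, so the selection never touches the test distribution.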