Geometry-Calibrated DRO: Combating Over-Pessimism with Free Energy Implications

Authors: Jiashuo Liu, Jiayun Wu, Tianyu Wang, Hao Zou, Bo Li, Peng Cui

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments confirm GCDRO's superiority over conventional DRO methods. In this section, we test the empirical performances of our proposed GCDRO on simulation data and real-world regression datasets with natural distribution shifts.
Researcher Affiliation | Collaboration | 1 Department of Computer Science and Technology, Tsinghua University; 2 Department of Industrial Engineering and Operations Research, Columbia University; 3 Zhongguancun Lab; 4 School of Economics and Management, Tsinghua University.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement or link indicating the release of its own source code for the described methodology.
Open Datasets | Yes | Datasets. (1) Bike-sharing dataset (Dua & Graff, 2017)... (2) House Price dataset (https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data)... (3) Temperature dataset (Dua & Graff, 2017)... URL http://archive.ics.uci.edu/ml.
Dataset Splits | Yes | In training, we generate 9,500 points with r = 1.9 (majority, strong positive spurious correlation V-Y) and 500 points with r = 1.3 (minority, weak negative spurious correlation V-Y). In practice, we do a grid search over α ∈ [0.1, 10] on an independent held-out validation dataset to select the best α (a sketch of this selection procedure appears after the table).
Hardware Specification | No | The paper mentions 'GPU' generally for parallelization but does not specify any particular GPU model, CPU, or other hardware components used for experiments.
Software Dependencies | No | The paper mentions software like 'PyTorch' and the 'DGL package (Wang et al., 2019)' but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | For all these experiments, we use a two-layer MLP model with mean squared error (MSE). We use the Adam optimizer (Kingma & Ba, 2015) with the default learning rate 1e-3, and all methods are trained for 5e3 epochs. The hyper-parameter search space is specified in the Appendix.
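
The rows above pin down the basic training recipe but not the GCDRO objective itself. Below is a minimal sketch, assuming plain ERM training as a stand-in for GCDRO, of the reported setup: a two-layer MLP regressor, MSE loss, Adam with learning rate 1e-3, 5e3 training epochs, and selection of the hyper-parameter α by grid search over [0.1, 10] on a held-out validation set. The hidden width, the specific grid values, and the helper names (`make_mlp`, `train_and_validate`, `select_alpha`) are illustrative assumptions, not taken from the paper.

```python
# Sketch of the reported recipe: two-layer MLP + MSE + Adam(lr=1e-3), 5e3 epochs,
# with alpha chosen by grid search on a held-out validation set.
# NOTE: this trains with plain ERM; the GCDRO reweighting that alpha controls
# in the paper is NOT implemented here.
import torch
import torch.nn as nn


def make_mlp(in_dim, hidden_dim=64):
    # Two-layer MLP regressor; the hidden width is an assumption.
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.ReLU(),
        nn.Linear(hidden_dim, 1),
    )


def train_and_validate(alpha, train_data, val_data, epochs=5000):
    """Train one model and return its validation MSE.

    `alpha` is accepted but unused in this ERM stand-in; in GCDRO it would
    control the per-sample weighting of the training loss.
    """
    X_tr, y_tr = train_data
    X_val, y_val = val_data
    model = make_mlp(X_tr.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # default lr reported in the paper
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X_tr).squeeze(-1), y_tr)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return loss_fn(model(X_val).squeeze(-1), y_val).item()


def select_alpha(train_data, val_data, grid=(0.1, 0.5, 1.0, 2.0, 5.0, 10.0)):
    # Grid search over alpha in [0.1, 10]; the specific grid values are assumed.
    scores = {a: train_and_validate(a, train_data, val_data) for a in grid}
    return min(scores, key=scores.get)
```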