Heterogeneous Risk Minimization
Authors: Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results validate the effectiveness of our HRM framework. ... Extensive experiments on both synthetic and real-world datasets demonstrate the superiority of HRM in terms of average performance, stability performance as well as worst-case performance under different settings of distributional shifts. |
| Researcher Affiliation | Academia | ¹Department of Computer Science and Technology, Tsinghua University, Beijing, China; Email: {liujiashuo77, zyhu2001}@gmail.com, cuip@tsinghua.edu.cn, shenzy17@mails.tsinghua.edu.cn. ²School of Economics and Management, Tsinghua University, Beijing, China; Email: libo@sem.tsinghua.edu.cn. Correspondence to: Peng Cui <cuip@tsinghua.edu.cn>. |
| Pseudocode | No | The paper describes the algorithm steps in text and mathematical formulations but does not include a formal pseudocode block or an algorithm box. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | Car Insurance Prediction: In this task, we use a real-world dataset for car insurance prediction (Kaggle). ... People Income Prediction: In this task we use the Adult dataset (Dua & Graff, 2017) to predict personal income levels as above or below $50,000 per year based on personal details. ... House Price Prediction: In this experiment, we use a real-world regression dataset (Kaggle) of house sales prices from King County, USA. |
| Dataset Splits | Yes | In training, we generate sum = 2000 data points, where κ = 95% points from environment e1 with a predefined r and 1 − κ = 5% points from e2 with r = 1.1. In testing, we generate data points for 10 environments with r ∈ [−3, −2.7, −2.3, ..., 2.3, 2.7, 3.0]. ... In training phase, all methods are trained on pooled data including 693 points from environment 1 and 200 from environment 2, and validated on 100 sampled from both. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | For simplicity, we select data points according to a certain variable set $V_b \subset \Psi^*$: $\hat{P} = \prod_{v_i \in V_b} \lvert r \rvert^{-5 \cdot \lvert f(\phi^*) - \operatorname{sign}(r) \cdot v_i \rvert}$ (18) ... In training, we generate sum = 2000 data points, where κ = 95% points from environment e1 with a predefined r and 1 − κ = 5% points from e2 with r = 1.1. In testing, we generate data points for 10 environments with r ∈ [−3, −2.7, −2.3, ..., 2.3, 2.7, 3.0]. β is set to 1.0. We compare our HRM with ERM, DRO, EIIL and IRM for Linear Regression. ... In this experiment, we set β = 0.1 and build 10 environments with varying σ and the dimension of Φ*, Ψ*, the first three for training and the last seven for testing. We run experiments for 10 times and the averaged results are shown in Table 3. |
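
The selection-bias setup quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a small sketch. It is not the authors' code (none is released, per the table). It assumes that Eq. 18 reads as a selection probability of the form ∏ |r|^(−5·|f(φ) − sign(r)·v_i|) over the variables in V_b, and that |r| > 1 so the value stays in (0, 1]. The function names (`split_sizes`, `selection_prob`, `sample_environment`), the standard normal covariates, and the noiseless f(φ) = φ are all hypothetical choices made for illustration.

```python
import math
import random


def split_sizes(total, kappa):
    """Return (n_e1, n_e2): kappa of the points from e1, the rest from e2."""
    n_e1 = round(total * kappa)
    return n_e1, total - n_e1


def selection_prob(f_phi, v_b, r):
    """Assumed reading of Eq. 18: prod_i |r| ** (-5 * |f_phi - sign(r) * v_i|)."""
    if abs(r) <= 1:
        # Assumption of this sketch: |r| > 1 keeps the result in (0, 1].
        raise ValueError("|r| must exceed 1")
    sign_r = math.copysign(1.0, r)
    exponent = sum(-5.0 * abs(f_phi - sign_r * v) for v in v_b)
    return abs(r) ** exponent


def sample_environment(n, r, rng, n_b=1):
    """Rejection-sample n points (phi, v_b, y) under the bias strength r.

    The data distributions are hypothetical: phi and each v_i are standard
    normal, and the target is the noiseless y = f(phi) = phi.
    """
    points = []
    while len(points) < n:
        phi = rng.gauss(0.0, 1.0)
        y = phi
        v_b = [rng.gauss(0.0, 1.0) for _ in range(n_b)]
        # Keep the point with the probability given by the assumed Eq. 18.
        if rng.random() < selection_prob(y, v_b, r):
            points.append((phi, v_b, y))
    return points


if __name__ == "__main__":
    rng = random.Random(0)
    n_e1, n_e2 = split_sizes(2000, 0.95)  # 1900 points from e1, 100 from e2
    e2 = sample_environment(n_e2, 1.1, rng)
    print(n_e1, n_e2, len(e2))
```

Under this reading, a larger |r| makes selection stricter, so the retained points show a stronger correlation between the biased variables and the target, and the sign of r sets the direction of that correlation. This is why the test environments sweep r over both negative and positive values.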