Heterogeneous Risk Minimization

Authors: Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results validate the effectiveness of our HRM framework. ... Extensive experiments in both synthetic and real-world experiments datasets demonstrate the superiority of HRM in terms of average performance, stability performance as well as worst-case performance under different settings of distributional shifts.
Researcher Affiliation | Academia | (1) Department of Computer Science and Technology, Tsinghua University, Beijing, China; Email: {liujiashuo77, zyhu2001}@gmail.com, cuip@tsinghua.edu.cn, shenzy17@mails.tsinghua.edu.cn. (2) School of Economics and Management, Tsinghua University, Beijing, China; Email: libo@sem.tsinghua.edu.cn. Correspondence to: Peng Cui <cuip@tsinghua.edu.cn>.
Pseudocode | No | The paper describes the algorithm steps in text and mathematical formulations but does not include a formal pseudocode block or an algorithm box.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | Car Insurance Prediction. In this task, we use a real-world dataset for car insurance prediction (Kaggle). ... People Income Prediction. In this task we use the Adult dataset (Dua & Graff, 2017) to predict personal income levels as above or below $50,000 per year based on personal details. ... House Price Prediction. In this experiment, we use a real-world regression dataset (Kaggle) of house sales prices from King County, USA.
Dataset Splits | Yes | In training, we generate sum = 2000 data points, where κ = 95% points from environment e1 with a predefined r and 1 − κ = 5% points from e2 with r = −1.1. In testing, we generate data points for 10 environments with r ∈ [−3, −2.7, −2.3, ..., 2.3, 2.7, 3.0]. ... In training phase, all methods are trained on pooled data including 693 points from environment 1 and 200 from environment 2, and validated on 100 sampled from both.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | For simplicity, we select data points according to a certain variable set $V_b \subset \Psi^*$: $\hat{P} = \prod_{v_i \in V_b} |r|^{-5 \cdot |f(\phi^*) - \mathrm{sign}(r) \cdot v_i|}$ (18) ... In training, we generate sum = 2000 data points, where κ = 95% points from environment e1 with a predefined r and 1 − κ = 5% points from e2 with r = −1.1. In testing, we generate data points for 10 environments with r ∈ [−3, −2.7, −2.3, ..., 2.3, 2.7, 3.0]. β is set to 1.0. We compare our HRM with ERM, DRO, EIIL and IRM for Linear Regression. ... In this experiment, we set β = 0.1 and build 10 environments with varying σ and the dimension of Φ*, Ψ*, the first three for training and the last seven for testing. We run experiments for 10 times and the averaged results are shown in Table 3.
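The selection-bias setup quoted in the Dataset Splits and Experiment Setup rows can be sketched roughly as follows. This is an illustration only, not the authors' code (the page reports that none was released). It reads Eq. (18) as a selection probability of the form |r|^(−5·|f(φ) − sign(r)·v_i|) multiplied over the variables in V_b, and it takes the minority environment's r to be negative. Both readings are assumptions. The function names, the Gaussian features, the linear f, the noise level, the feature dimensions, and the default r = 1.9 for e1 are all hypothetical choices made for this example. The term controlled by β is not modeled.

```python
import numpy as np


def selection_prob(f_phi, v_b, r):
    """Selection probability in the style of Eq. (18), assuming the form
    prod_i |r| ** (-5 * |f(phi) - sign(r) * v_i|) with |r| > 1."""
    exponent = -5.0 * np.abs(f_phi[:, None] - np.sign(r) * v_b)
    return np.prod(np.abs(r) ** exponent, axis=1)


def sample_environment(n, r, rng, d_phi=5, n_b=1, noise_std=0.3):
    """Rejection-sample n points for one environment defined by r.

    Hypothetical generative model: phi and v_b are standard normal,
    f(phi) is the sum of phi, and y = f(phi) + Gaussian noise.
    """
    xs, ys, count = [], [], 0
    while count < n:
        batch = 4 * n
        phi = rng.normal(size=(batch, d_phi))
        v_b = rng.normal(size=(batch, n_b))
        f_phi = phi.sum(axis=1)
        y = f_phi + rng.normal(scale=noise_std, size=batch)
        # Keep each candidate with the Eq. (18)-style probability.
        keep = rng.random(batch) < selection_prob(f_phi, v_b, r)
        xs.append(np.hstack([phi[keep], v_b[keep]]))
        ys.append(y[keep])
        count += int(keep.sum())
    return np.vstack(xs)[:n], np.concatenate(ys)[:n]


def make_training_set(total=2000, kappa=0.95, r_major=1.9, r_minor=-1.1, seed=0):
    """Pool kappa * total points from e1 with the rest from e2."""
    rng = np.random.default_rng(seed)
    n_major = int(round(kappa * total))
    X1, y1 = sample_environment(n_major, r_major, rng)
    X2, y2 = sample_environment(total - n_major, r_minor, rng)
    return np.vstack([X1, X2]), np.concatenate([y1, y2])


if __name__ == "__main__":
    X_train, y_train = make_training_set()
    print(X_train.shape, y_train.shape)
    # Only the r values actually listed on the page are used here; the
    # elided middle of the 10-environment test list is left unfilled.
    rng = np.random.default_rng(1)
    for r in [-3.0, -2.7, -2.3, 2.3, 2.7, 3.0]:
        X_test, y_test = sample_environment(500, r, rng)
        corr = np.corrcoef(X_test[:, -1], y_test)[0, 1]
        print(f"r={r:+.1f}  corr(V_b, y)={corr:+.2f}")
```

Under these assumptions the sign of r sets the direction of the correlation between V_b and y, so a model that leans on V_b in training can degrade in test environments where r has the opposite sign.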