Breaking Correlation Shift via Conditional Invariant Regularizer
Authors: Mingyang Yi, Ruoyu Wang, Jiacheng Sun, Zhenguo Li, Zhi-Ming Ma
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical results verify our algorithm's efficacy in improving OOD generalization. Concretely, we conduct experiments on benchmark classification datasets CelebA (Liu et al., 2015), Waterbirds (Sagawa et al., 2019), MultiNLI (Williams et al., 2018), and CivilComments (Borkan et al., 2019). Empirical results show that our algorithm consistently improves the model's generalization on OOD data with correlation shifts. |
| Researcher Affiliation | Collaboration | Mingyang Yi (1, 2, 3), Ruoyu Wang (1, 2), Jiacheng Sun (3), Zhenguo Li (3), Zhi-Ming Ma (1, 2). (1) University of Chinese Academy of Sciences, {yimingyang17,wangruoyu17}@mails.ucas.edu.cn; (2) Academy of Mathematics and Systems Science, Chinese Academy of Sciences, mazm@amt.ac.cn; (3) Huawei Noah's Ark Lab, {sunjiacheng1,li.zhenguo}@huawei.com |
| Pseudocode | Yes | Algorithm 1: Regularize training with CSV. Input: Training set $\{(x_i, y_i)\}_{i=1}^{n}$, number of labels $K_y$ and spurious attributes $K_z$, training steps $T$, model $f_\theta(\cdot)$ parameterized by $\theta$. Initialized $\theta_0$, $\{F^k_0\}$. Positive regularization constant $\lambda$, surrogate constant $\rho$, and correction constant $\gamma$. Estimators $\hat{R}_{\mathrm{emp}}(f_\theta, P)$ to $R_{\mathrm{emp}}(f_\theta, P)$, $\hat{F}^k(\theta)$ to $F^k(\theta)$. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Concretely, we conduct experiments on benchmark classification datasets CelebA (Liu et al., 2015), Waterbirds (Sagawa et al., 2019), MultiNLI (Williams et al., 2018), and CivilComments (Borkan et al., 2019). |
| Dataset Splits | Yes | The numbers of samples in training and test dataset from the 4 groups are respectively {71629, 9767}, {66874, 7535}, {22880, 2880}, {1387, 180}. Our goal is to train a model that correctly recognizes the hair color of celebrities independent of their gender. |
| Hardware Specification | No | The paper mentions using "ResNet-50 pre-trained on ImageNet" and "pre-trained BERT Base model" as backbone models but does not specify the hardware (e.g., GPU/CPU models, memory) used for the experiments. |
| Software Dependencies | No | The paper mentions optimizers like "AdamW", "Adam", and "SGD" but does not specify software dependencies with version numbers (e.g., PyTorch 1.x, Python 3.x). |
| Experiment Setup | Yes | The hyperparameters are in Appendix G.4. ... The hyperparameters of the proposed RCSV and RCSVU on CelebA, Waterbirds, MultiNLI, CivilComments, Toy example and C-MNIST are respectively summarized in Tables 10, 11, 12, 13, 14, and 15. |
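
The group counts quoted in the Dataset Splits row can be turned into per-group shares to see how unbalanced the four groups are. The following is only a minimal sketch of that arithmetic: the counts are copied from the row as reported, while the labels `group_1` to `group_4` and the helper names are hypothetical placeholders, since the row does not say which label and attribute combination each group corresponds to.

```python
# Group sizes as (train, test) pairs, copied from the Dataset Splits row.
# The group labels are hypothetical placeholders; the row does not name the groups.
group_sizes = {
    "group_1": (71629, 9767),
    "group_2": (66874, 7535),
    "group_3": (22880, 2880),
    "group_4": (1387, 180),
}


def split_totals(sizes):
    """Return (total training samples, total test samples) across all groups."""
    train_total = sum(train for train, _ in sizes.values())
    test_total = sum(test for _, test in sizes.values())
    return train_total, test_total


def train_shares(sizes):
    """Return each group's fraction of the training samples."""
    train_total, _ = split_totals(sizes)
    return {name: train / train_total for name, (train, _) in sizes.items()}


if __name__ == "__main__":
    train_total, test_total = split_totals(group_sizes)
    print(f"train total: {train_total}, test total: {test_total}")
    for name, share in train_shares(group_sizes).items():
        print(f"{name}: {share:.2%} of training samples")
```

With these counts the smallest group accounts for under 1% of the training samples, which is the kind of imbalance that makes robustness to correlation shift hard to measure from average accuracy alone.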