Mitigating Spurious Correlations via Disagreement Probability
Authors: Hyeonggeun Han, Sehwan Kim, Hyungjun Joo, Sangwoo Hong, Jungwoo Lee
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations on multiple benchmarks demonstrate that DPR achieves state-of-the-art performance over existing baselines that do not use bias labels. |
| Researcher Affiliation | Collaboration | ECE & NextQuantum, Seoul National University; Hodoo AI Labs ({hygnhan, sehwankim, joohj911, tkddn0606, junglee}@snu.ac.kr) |
| Pseudocode | Yes | Algorithm 1: Disagreement Probability based Resampling for debiasing (DPR) |
| Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] |
| Open Datasets | Yes | Colored MNIST (C-MNIST) is a synthetic dataset designed for digit classification, comprising ten digits, each spuriously correlated with a specific color. Following the protocols in Ahn et al. [1], we set the ratios of bias-conflicting samples, denoted as ρ, at 0.5%, 1%, and 5% for the training set, and 90% for the unbiased test set. |
| Dataset Splits | Yes | Additionally, we use 10% of the training data as validation data, and an unbiased test set with a bias-conflicting ratio of 90% is employed for performance evaluation. |
| Hardware Specification | Yes | All classification models are trained using an NVIDIA RTX A6000. |
| Software Dependencies | No | The paper mentions optimizers (SGD, Adam, AdamW) and models (BERT) but does not provide specific version numbers for any software dependencies (e.g., library or framework versions like PyTorch 1.9). |
| Experiment Setup | Yes | We train the model for 100 epochs with SGD optimizer, a batch size of 128, a learning rate of 0.02, weight decay of 0.001, momentum of 0.9, and learning rate decay of 0.1 at every 40 steps. |
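The quoted experiment setup can be sketched as a configuration with a step-decay learning-rate schedule. This is a minimal, dependency-free sketch assuming the stated values (SGD, 100 epochs, batch size 128, lr 0.02, weight decay 0.001, momentum 0.9, decay factor 0.1 every 40 steps); the function and dictionary names are illustrative, not taken from the paper's code.

```python
# Hedged sketch of the training configuration quoted above.
# Names (learning_rate, train_config) are illustrative, not the authors' identifiers.

train_config = {
    "optimizer": "SGD",
    "epochs": 100,
    "batch_size": 128,
    "base_lr": 0.02,
    "weight_decay": 0.001,
    "momentum": 0.9,
    "lr_decay": 0.1,      # multiplicative decay factor
    "decay_every": 40,    # applied every 40 steps, per the quoted setup
}

def learning_rate(step, base_lr=0.02, decay=0.1, interval=40):
    """Step-decay schedule: lr = base_lr * decay ** (step // interval)."""
    return base_lr * decay ** (step // interval)

# Over 100 epochs this yields 0.02 for steps 0-39,
# then 0.002 for steps 40-79, then 0.0002 for steps 80-99.
```

In PyTorch this would correspond to `torch.optim.SGD` combined with a `StepLR` scheduler (`step_size=40`, `gamma=0.1`), though the paper does not state which framework or scheduler class was used.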