Focus on the Common Good: Group Distributional Robustness Follows
Authors: Vihari Piratla, Praneeth Netrapalli, Sunita Sarawagi
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our proposed algorithm matches or achieves better performance compared to strong contemporary baselines including ERM and Group-DRO on standard benchmarks on both minority groups and across all groups. |
| Researcher Affiliation | Collaboration | ¹ Indian Institute of Technology, Bombay; ² Google Research, India |
| Pseudocode | Yes | Algorithm 1: CGD Algorithm (a hedged sketch of a group-weighted update in this spirit follows the table). |
| Open Source Code | Yes | Our code and datasets can be found at this URL. We release the anonymized implementation of our algorithms publicly at this link, with instructions on running the code and pointers to datasets. |
| Open Datasets | Yes | We evaluated on eight datasets, which include two synthetic datasets with induced spurious correlations: Colored MNIST, Waterbirds; two real-world datasets with known spurious correlations: CelebA, MultiNLI; four WILDS (Koh et al., 2020) datasets with a mix of sub-population and domain shift. (LeCun & Cortes, 2010) |
| Dataset Splits | Yes | We use standard train-validation-test splits for all the datasets when available. |
| Hardware Specification | No | We gratefully acknowledge Google's TPU research grant that accelerated our experiments. This mentions TPUs but lacks specific models or configurations (e.g., TPU v2 vs. v3, number of cores, memory). |
| Software Dependencies | No | We use the codestack released with the WILDS (Koh et al., 2020) dataset as our base implementation. This mentions a codestack but does not provide specific software version numbers within the paper. |
| Experiment Setup | Yes | We search C, the group adjustment parameter for Group-DRO and CGD and the group weighting parameter for ERM-UW, over the range [0, 20]. We pick the best learning rate from {1e-3, 1e-5}, weight decay from {1e-4, 0.1, 1}, use the SGD optimizer, and set the batch size to 1024, 128, 64, 32 respectively. (The search grid is rendered as a hedged sketch after the table.) |
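
The Pseudocode row points to Algorithm 1 (CGD). Below is a minimal, hedged sketch of a group-weighted training step in that spirit: per-group gradients are computed, group weights are nudged toward groups whose gradients align with the common (weighted) descent direction, and the model is updated along the re-weighted gradient. The multiplicative-weights rule, the `step_size` parameter, and all function names are illustrative assumptions, not the paper's exact Algorithm 1.

```python
import torch

def group_weighted_step(model, loss_fn, group_batches, weights, lr=1e-3, step_size=0.1):
    """One illustrative update: per-group gradients -> re-weight groups -> weighted SGD step."""
    params = [p for p in model.parameters() if p.requires_grad]
    group_grads = []
    for x, y in group_batches:                       # one (inputs, labels) mini-batch per group
        loss = loss_fn(model(x), y)
        grads = torch.autograd.grad(loss, params)
        group_grads.append(torch.cat([g.flatten() for g in grads]))

    G = torch.stack(group_grads)                     # [num_groups, num_params]
    avg_grad = weights @ G                           # current weighted descent direction
    scores = G @ avg_grad                            # alignment of each group with that direction
    # Multiplicative-weights style re-weighting (assumed form; shifted for numerical stability).
    weights = weights * torch.exp(step_size * (scores - scores.max()))
    weights = weights / weights.sum()

    with torch.no_grad():                            # plain SGD step along the re-weighted gradient
        direction = weights @ G
        offset = 0
        for p in params:
            n = p.numel()
            p -= lr * direction[offset:offset + n].view_as(p)
            offset += n
    return weights
```

In a training loop, `weights` would start uniform over the groups (e.g., `torch.full((num_groups,), 1.0 / num_groups)`) and the returned weights would be carried into the next step.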
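
The Experiment Setup row lists the search ranges; the sketch below only renders them as a concrete grid. The specific values used to discretise C over [0, 20], the `train_and_validate` driver, and the per-dataset assignment of the four batch sizes are placeholders, not taken from the paper.

```python
# Hedged rendering of the hyper-parameter search described in the setup row above.
from itertools import product

learning_rates = [1e-3, 1e-5]             # from the setup row
weight_decays = [1e-4, 0.1, 1.0]          # from the setup row
group_adjust_C = [0, 1, 2, 5, 10, 20]     # assumed discretisation of the range [0, 20]
batch_sizes = [1024, 128, 64, 32]         # per-dataset values; the dataset mapping is not reproduced here

for lr, wd, c in product(learning_rates, weight_decays, group_adjust_C):
    config = {"optimizer": "SGD", "lr": lr, "weight_decay": wd, "C": c}
    print(config)
    # train_and_validate(config)          # hypothetical driver; best config picked on validation
```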