Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts
Authors: Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 EXPERIMENTS. Our experiments aim to evaluate the performance of BR-DRO and compare it with ERM and group shift robustness methods that do not require group annotations for training examples. We conduct empirical analyses along the following axes: (i) worst group performance on datasets that exhibit known spurious correlations; (ii) robustness to random label noise in the training data; (iii) average performance on hybrid covariate shift datasets with unspecified groups; and (iv) accuracy in identifying minority groups. See Appendix B for additional experiments and details. ... Table 1 compares the average and worst group accuracy for BR-DRO with ERM and four group shift robustness baselines... |
| Researcher Affiliation | Academia | 1 Carnegie Mellon University 2 Stanford University 3 UC Berkeley |
| Pseudocode | Yes | Thus, we provide an algorithm where both learner and adversary optimize BR-DRO iteratively through stochastic gradient ascent/descent (Algorithm 1 in Appendix A.1). |
| Open Source Code | Yes | The code used in our experiments can be found at https://github.com/ars22/bitrate_DRO. |
| Open Datasets | Yes | (i) Waterbirds (Wah et al., 2011) (background is spurious), CelebA (Liu et al., 2015) (binary gender is spuriously correlated with label "blond"); and CivilComments (WILDS) (Borkan et al., 2019) where the task is to predict toxic texts and there are 16 predefined groups (Koh et al., 2021). |
| Dataset Splits | Yes | To tune hyperparameters, like prior work we assume access to some group annotations on the validation set, but we also get decent performance (on some datasets) with only a balanced validation set (see Appendix B). |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper mentions 'Implementation details' and states 'We provide model selection methodology and other details in Appendix B', but does not explicitly provide concrete hyperparameter values, learning rates, batch sizes, or number of epochs in the main text. |
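
The Research Type row refers to average and worst group accuracy as the reported metrics. As a reading aid, the following is a minimal sketch of how those two numbers are typically computed from per-example predictions and group labels. It is not taken from the paper or its repository; the function names `group_accuracies` and `average_and_worst_group_accuracy` and the toy data are hypothetical.

```python
from collections import defaultdict


def group_accuracies(preds, labels, groups):
    """Return a dict mapping each group id to the accuracy on that group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    return {g: correct[g] / total[g] for g in total}


def average_and_worst_group_accuracy(preds, labels, groups):
    """Return (average accuracy over all examples, lowest per-group accuracy)."""
    per_group = group_accuracies(preds, labels, groups)
    avg = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
    worst = min(per_group.values())
    return avg, worst


# Hypothetical toy data: group "a" is classified perfectly, group "b" is not.
preds = [1, 1, 0, 0, 1, 0]
labels = [1, 1, 0, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]
avg, worst = average_and_worst_group_accuracy(preds, labels, groups)
```

In this toy case the average accuracy hides the weaker group: the overall number is pulled up by group "a", while the worst group value reflects only group "b". That gap is what worst group evaluation is meant to expose.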