DORO: Distributional and Outlier Robust Optimization
Authors: Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct large-scale experiments on modern datasets. Our results show that DORO improves the performance and stability of DRO. |
| Researcher Affiliation | Academia | School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA. Correspondence to: Runtian Zhai <rzhai@cmu.edu>. |
| Pseudocode | Yes | Algorithm 1: DORO with Dβ divergence. Input: batch size n, outlier fraction ϵ, minimal group size α. For each iteration: sample a batch z_1, …, z_n ∼ Ptrain; compute losses ℓ_i = ℓ(θ; z_i) for i = 1, …, n; sort the losses ℓ_{i_1} ≥ ℓ_{i_2} ≥ … ≥ ℓ_{i_n}; find η* = argmin_η F(θ, η), where F(θ, η) = c_β(ρ)·[ (1/(n − ⌈ϵn⌉)) Σ_{j=⌈ϵn⌉+1}^{n} (ℓ(θ; z_{i_j}) − η)_+^β ]^{1/β} + η; update θ by one step to minimize ℓ(θ) = F(θ, η*) with some gradient method. (A runnable sketch of the β = 1 case follows the table.) |
| Open Source Code | Yes | Codes are available at https://github.com/RuntianZ/doro. |
| Open Datasets | Yes | We conduct large-scale experiments on three datasets: the tabular dataset COMPAS, the vision dataset CelebA, and the language dataset CivilComments-Wilds. ... We summarize the datasets we use as follows: (i) COMPAS (Larson et al., 2016): recidivism prediction... (ii) CelebA (Liu et al., 2015): human face recognition... (iii) CivilComments-Wilds (Borkan et al., 2019; Koh et al., 2020): toxicity identification... |
| Dataset Splits | Yes | For COMPAS, we randomly sample 70% of the instances to be the training data (with a fixed random seed) and the rest is the validation/testing data. Both CelebA and CivilComments-Wilds have official train-validation-test splits, so we use them directly. (A sketch of the fixed-seed 70/30 split follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) are mentioned in the paper related to the experiments conducted. |
| Software Dependencies | No | No specific software versions (e.g., Python, PyTorch, TensorFlow versions) are mentioned in the paper, only high-level model architectures. |
| Experiment Setup | Yes | Each algorithm is run 300 epochs on COMPAS, 30 epochs on CelebA, and 5 epochs on CivilComments-Wilds. ... For every DRO and DORO method, we do a grid search to pick the best α and ϵ that achieve the best worst-case accuracy. (A sketch of this selection step follows the table.) |
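
The pseudocode row quotes Algorithm 1 (DORO with Dβ divergence). As a concrete illustration, below is a minimal PyTorch-style sketch of the β = 1 (CVaR) special case, in which minimizing F(θ, η) over η reduces to averaging the largest α-fraction of the losses that remain after the ⌈ϵn⌉ largest losses have been discarded as potential outliers. The function name, variable names, and the illustrative ϵ and α values are ours, not taken from the paper's released code.

```python
import math
import torch

def cvar_doro_loss(losses: torch.Tensor, eps: float, alpha: float) -> torch.Tensor:
    """Sketch of the CVaR-DORO batch objective (beta = 1 case of Algorithm 1).

    losses: 1-D tensor of per-example losses for the current batch
    eps:    assumed fraction of outliers to discard
    alpha:  minimal group size (CVaR level) applied to the remaining losses
    """
    n = losses.numel()
    n_out = math.ceil(eps * n)                     # number of largest losses dropped as outliers
    sorted_losses, _ = torch.sort(losses, descending=True)
    kept = sorted_losses[n_out:]                   # losses that survive the outlier filter
    k = max(1, math.ceil(alpha * kept.numel()))    # size of the worst alpha-fraction kept
    return kept[:k].mean()                         # CVaR over the kept losses

# Usage inside an otherwise standard training step (illustrative values):
# per_example = torch.nn.functional.cross_entropy(model(x), y, reduction="none")
# loss = cvar_doro_loss(per_example, eps=0.01, alpha=0.2)
# loss.backward(); optimizer.step()
```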
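
For the COMPAS split quoted in the Dataset Splits row, a fixed-seed 70/30 split could be reproduced along the following lines; this is a sketch, and the seed value and function name are assumptions rather than details from the released code.

```python
from sklearn.model_selection import train_test_split

def split_compas(compas_df, seed=0):
    """70% training / 30% validation-testing split with a fixed random seed.
    The seed value 0 is a placeholder, not the paper's exact choice."""
    return train_test_split(compas_df, train_size=0.7, random_state=seed, shuffle=True)
```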
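
The Experiment Setup row mentions a grid search over α and ϵ selected by the best worst-case (worst-group) accuracy. The selection step could look like the sketch below, assuming a hypothetical `train_and_eval(alpha, eps)` helper that trains a model with the given pair and returns its per-group validation accuracies; the grid values shown are illustrative, not the paper's exact grid.

```python
from itertools import product

def select_alpha_eps(train_and_eval, alphas, epsilons):
    """Pick the (alpha, eps) pair whose model attains the highest worst-group
    validation accuracy. `train_and_eval` is a hypothetical helper returning a
    list of per-group accuracies for one trained model."""
    best_pair, best_worst_acc = None, float("-inf")
    for alpha, eps in product(alphas, epsilons):
        group_accs = train_and_eval(alpha, eps)
        worst_acc = min(group_accs)          # worst-case accuracy over groups
        if worst_acc > best_worst_acc:
            best_pair, best_worst_acc = (alpha, eps), worst_acc
    return best_pair, best_worst_acc

# Example call with an illustrative grid:
# best, acc = select_alpha_eps(train_and_eval, alphas=[0.1, 0.2, 0.3], epsilons=[0.005, 0.01, 0.02])
```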