Diversify and Disambiguate: Out-of-Distribution Robustness via Disagreement
Authors: Yoonho Lee, Huaxiu Yao, Chelsea Finn
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluation shows improved performance in subpopulation shift and domain generalization settings, demonstrating that DivDis can scalably adapt to distribution shifts in image and text classification benchmarks. |
| Researcher Affiliation | Academia | Yoonho Lee (Stanford University), Huaxiu Yao (Stanford University), Chelsea Finn (Stanford University) |
| Pseudocode | Yes | Algorithm 1: DivDis training (a hedged sketch of the corresponding objective appears after this table) |
| Open Source Code | Yes | Email: yoonho@cs.stanford.edu. Code is available at https://github.com/yoonholee/DivDis. |
| Open Datasets | Yes | MNIST (Deng, 2012), SVHN (Netzer et al., 2011), Fashion-MNIST (Xiao et al., 2017), and CIFAR (Krizhevsky et al., 2009) datasets |
| Dataset Splits | No | The paper mentions using "held-out data" for hyperparameter tuning and "validation sets" for target data in specific contexts, but it does not provide explicit training/validation/test split percentages or sample counts for the main datasets in the main text. |
| Hardware Specification | No | The paper describes the model architecture and computational complexity but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states, "This implementation is based on the PyTorch (Paszke et al., 2019) and einops (Rogozhnikov, 2022) libraries." However, it does not specify exact version numbers for these software components. |
| Experiment Setup | Yes | Unless stated otherwise, DivDis uses a network with 2 heads, and the DISAMBIGUATE stage uses the active querying strategy with 16 labels, which the authors found suffices to recover the best of the two heads. The paper closely follows the experimental settings of previous works; all experimental details, including datasets, architectures, and hyperparameters, are in the appendix. In particular, the objective function in Section 3.2 introduces two hyperparameters: "The overall objective for the DIVERSIFY stage is a weighted sum with hyperparameters λ1, λ2 ∈ ℝ" (a minimal sketch of this two-stage setup follows the table). |
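
To make the reported setup concrete, here is a minimal PyTorch sketch of a two-head DIVERSIFY objective of the form described above: supervised cross-entropy on source data, plus a λ1-weighted mutual-information term that pushes the heads to disagree on unlabeled target data, plus a λ2-weighted marginal regularizer. This is an illustration under stated assumptions, not the authors' implementation (see the linked repository): the names `backbone`, `heads`, `lambda1`, `lambda2`, and the uniform reference marginal (the paper regularizes toward a reference label distribution) are all assumptions.

```python
import torch
import torch.nn.functional as F

def diversify_loss(backbone, heads, x_src, y_src, x_tgt,
                   lambda1=1.0, lambda2=1.0, eps=1e-8):
    """Sketch of a DivDis-style DIVERSIFY objective for two heads.

    Hypothetical interface: `backbone` maps inputs to features, and each
    element of `heads` maps features to class logits.
    """
    # Supervised loss on labeled source data, summed over both heads.
    feats_src = backbone(x_src)
    xent = sum(F.cross_entropy(h(feats_src), y_src) for h in heads)

    # Head predictions on unlabeled target data.
    feats_tgt = backbone(x_tgt)
    p1, p2 = (F.softmax(h(feats_tgt), dim=-1) for h in heads)  # each (B, C)

    # Mutual information between the two heads' predicted labels,
    # estimated from batch statistics: KL(joint || product of marginals).
    joint = torch.einsum("bi,bj->ij", p1, p2) / p1.shape[0]    # (C, C)
    marg1, marg2 = p1.mean(0), p2.mean(0)
    mi = (joint * ((joint + eps).log()
                   - (torch.outer(marg1, marg2) + eps).log())).sum()

    # Regularizer keeping each head's target marginal close to a reference
    # distribution (uniform here as a simplifying assumption).
    C = marg1.shape[0]
    uniform = torch.full((C,), 1.0 / C, device=marg1.device)
    reg = sum((m * ((m + eps).log() - uniform.log())).sum()
              for m in (marg1, marg2))

    # Weighted sum with hyperparameters lambda1, lambda2.
    return xent + lambda1 * mi + lambda2 * reg
```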
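The DISAMBIGUATE stage with active querying and a 16-label budget can be sketched similarly. The selection criterion below (total-variation disagreement between the heads) and the `oracle_label` callable are illustrative assumptions; the paper compares several querying strategies, and the repository is authoritative for the one actually used.

```python
import torch

@torch.no_grad()
def disambiguate(backbone, heads, x_tgt, oracle_label, budget=16):
    """Sketch: query labels where the heads disagree most, keep the best head.

    `oracle_label` is a hypothetical callable returning true labels for the
    queried target indices (e.g., from a held-out labeled pool).
    """
    feats = backbone(x_tgt)
    probs = [torch.softmax(h(feats), dim=-1) for h in heads]   # each (B, C)
    preds = [p.argmax(-1) for p in probs]                      # per-head labels

    # Disagreement score: total-variation distance between head outputs.
    disagreement = 0.5 * (probs[0] - probs[1]).abs().sum(-1)   # (B,)
    query_idx = disagreement.topk(budget).indices              # 16 queries

    # Label the queried points and keep the head that fits them best.
    y_true = oracle_label(query_idx)                           # (budget,)
    accs = [(p[query_idx] == y_true).float().mean() for p in preds]
    return int(torch.stack(accs).argmax())                     # best head index
```

Together, the two sketches mirror the reported setup: DIVERSIFY trains the 2-head network with the λ1/λ2-weighted objective, and DISAMBIGUATE spends the 16-label budget to select a single head for deployment.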