Diversify and Disambiguate: Out-of-Distribution Robustness via Disagreement
Authors: Yoonho Lee, Huaxiu Yao, Chelsea Finn
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental evaluation shows improved performance in subpopulation shift and domain generalization settings, demonstrating that DivDis can scalably adapt to distribution shifts in image and text classification benchmarks. |
| Researcher Affiliation | Academia | Yoonho Lee (Stanford University), Huaxiu Yao (Stanford University), Chelsea Finn (Stanford University) |
| Pseudocode | Yes | Algorithm 1: DivDis training (a hedged sketch of the corresponding objective appears after this table) |
| Open Source Code | Yes | Email: yoonho@cs.stanford.edu. Code is available at https://github.com/yoonholee/DivDis. |
| Open Datasets | Yes | MNIST (Deng, 2012), SVHN (Netzer et al., 2011), Fashion-MNIST (Xiao et al., 2017), and CIFAR (Krizhevsky et al., 2009) datasets |
| Dataset Splits | No | The paper mentions using "held-out data" for hyperparameter tuning and "validation sets" for target data in specific contexts, but it does not provide explicit training/validation/test split percentages or sample counts for the main datasets in the main text. |
| Hardware Specification | No | The paper describes the model architecture and computational complexity but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states, "This implementation is based on the PyTorch (Paszke et al., 2019) and einops (Rogozhnikov, 2022) libraries." However, it does not specify exact version numbers for these software components. |
| Experiment Setup | Yes | Unless stated otherwise, DivDis uses a network with 2 heads, and the DISAMBIGUATE stage uses the active querying strategy with 16 labels, which the authors found suffices to recover the best of the two heads. The paper closely follows the experimental settings of previous works; all experimental details, including datasets, architectures, and hyperparameters, are in the appendix. In particular, the objective function in Section 3.2 introduces two hyperparameters: "The overall objective for the DIVERSIFY stage is a weighted sum with hyperparameters λ1, λ2 ∈ ℝ" (a minimal sketch of this two-stage setup follows the table). |
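
To make the reported setup concrete, here is a minimal PyTorch sketch of a two-head DIVERSIFY objective of the form described above: supervised cross-entropy on source data, plus a λ1-weighted mutual-information term that pushes the heads to disagree on unlabeled target data, plus a λ2-weighted marginal regularizer. This is an illustration under stated assumptions, not the authors' implementation (see the linked repository): the names `backbone`, `heads`, `lambda1`, `lambda2`, and the uniform reference marginal (the paper regularizes toward a reference label distribution) are all assumptions.

```python
import torch
import torch.nn.functional as F

def diversify_loss(backbone, heads, x_src, y_src, x_tgt,
                   lambda1=1.0, lambda2=1.0, eps=1e-8):
    """Sketch of a DivDis-style DIVERSIFY objective for two heads.

    Hypothetical interface: `backbone` maps inputs to features, and each
    element of `heads` maps features to class logits.
    """
    # Supervised loss on labeled source data, summed over both heads.
    feats_src = backbone(x_src)
    xent = sum(F.cross_entropy(h(feats_src), y_src) for h in heads)

    # Head predictions on unlabeled target data.
    feats_tgt = backbone(x_tgt)
    p1, p2 = (F.softmax(h(feats_tgt), dim=-1) for h in heads)  # each (B, C)

    # Mutual information between the two heads' predicted labels,
    # estimated from batch statistics: KL(joint || product of marginals).
    joint = torch.einsum("bi,bj->ij", p1, p2) / p1.shape[0]    # (C, C)
    marg1, marg2 = p1.mean(0), p2.mean(0)
    mi = (joint * ((joint + eps).log()
                   - (torch.outer(marg1, marg2) + eps).log())).sum()

    # Regularizer keeping each head's target marginal close to a reference
    # distribution (uniform here as a simplifying assumption).
    C = marg1.shape[0]
    uniform = torch.full((C,), 1.0 / C, device=marg1.device)
    reg = sum((m * ((m + eps).log() - uniform.log())).sum()
              for m in (marg1, marg2))

    # Weighted sum with hyperparameters lambda1, lambda2.
    return xent + lambda1 * mi + lambda2 * reg
```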
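The DISAMBIGUATE stage with active querying and a 16-label budget can be sketched similarly. The selection criterion below (total-variation disagreement between the heads) and the `oracle_label` callable are illustrative assumptions; the paper compares several querying strategies, and the repository is authoritative for the one actually used.

```python
import torch

@torch.no_grad()
def disambiguate(backbone, heads, x_tgt, oracle_label, budget=16):
    """Sketch: query labels where the heads disagree most, keep the best head.

    `oracle_label` is a hypothetical callable returning true labels for the
    queried target indices (e.g., from a held-out labeled pool).
    """
    feats = backbone(x_tgt)
    probs = [torch.softmax(h(feats), dim=-1) for h in heads]   # each (B, C)
    preds = [p.argmax(-1) for p in probs]                      # per-head labels

    # Disagreement score: total-variation distance between head outputs.
    disagreement = 0.5 * (probs[0] - probs[1]).abs().sum(-1)   # (B,)
    query_idx = disagreement.topk(budget).indices              # 16 queries

    # Label the queried points and keep the head that fits them best.
    y_true = oracle_label(query_idx)                           # (budget,)
    accs = [(p[query_idx] == y_true).float().mean() for p in preds]
    return int(torch.stack(accs).argmax())                     # best head index
```

Together, the two sketches mirror the reported setup: DIVERSIFY trains the 2-head network with the λ1/λ2-weighted objective, and DISAMBIGUATE spends the 16-label budget to select a single head for deployment.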