Training OOD Detectors in their Natural Habitats
Authors: Julian Katz-Samuels, Julia B Nakhleh, Robert Nowak, Yixuan Li
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively evaluate our approach on common OOD detection tasks and demonstrate superior performance. Code is available at https://github.com/jkatzsam/woods_ood. ... We extensively evaluate our approach on common OOD detection tasks and establish state-of-the-art performance. For completeness, we compare with two groups of approaches: (1) trained with only Pin data, and (2) trained with both Pin and an auxiliary dataset. On CIFAR-100, compared to a strong baseline using only Pin, our method outperforms by 48.10% (FPR95) on average. The performance gain precisely demonstrates the advantage of incorporating unlabeled wild data. Our method also outperforms Outlier Exposure (OE) (Hendrycks et al., 2019) by 7.36% in FPR95. |
| Researcher Affiliation | Academia | 1Institute for Foundations of Data Science, University of Wisconsin, Madison 2Department of Computer Sciences, University of Wisconsin, Madison 3Department of Electrical and Computer Engineering, University of Wisconsin, Madison. |
| Pseudocode | Yes | Algorithm 1 WOODS (Wild OOD detection sans Supervision). 1: Input: θ(1)(1), λ(1)(1), β1, β2, epoch length T, batch size B, learning rates µ1 and µ2, penalty multiplier γ, tol. 2: for epoch = 1, 2, . . . do 3: for t = 1, 2, . . . , T − 1 do 4: Sample a batch of data, calculate L^batch_β(θ, λ). 5: θ(t+1)(epoch) ← θ(t)(epoch) − µ1 ∇θ L^batch_β(θ, λ). 6: end for 7: λ(epoch+1) ← λ(epoch) + µ2 ∇λ Lβ(θ(T)(epoch), λ(epoch)). 8: if (1/n) Σᵢ Lood(g_θ(T)(epoch)(x̃i), out) > α + tol then 9: β1 ← γβ1. 10: end if 11: if (1/n) Σᵢ Lcls(f_θ(T)(epoch)(xi), yi) > τ + tol then 12: β2 ← γβ2. 13: end if 14: θ(1)(epoch+1) ← θ(T)(epoch). 15: end for |
| Open Source Code | Yes | Code is available at https://github.com/jkatzsam/woods_ood. |
| Open Datasets | Yes | Datasets Following the common benchmarks in literature, we use CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009) as ID datasets (Pin). For OOD test datasets Ptest out, we use a suite of natural image datasets including SVHN (Netzer et al., 2011), Textures (Cimpoi et al., 2014), Places365 (Zhou et al., 2018), LSUN-Crop (Yu et al., 2016), and LSUN-Resize (Yu et al., 2016). |
| Dataset Splits | Yes | We split CIFAR datasets into two halves: 25,000 images as ID training data, and the remaining 25,000 used to create the wild mixture data. ... We use 70% of SVHN for the mixture training dataset and for the validation dataset. We use the remaining examples for the test set. Of the data for training/validation, we use 30% for validation and the remaining for training. |
| Hardware Specification | Yes | All training is performed in PyTorch using NVIDIA GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number for it or any other software component, which is necessary for reproducible software dependencies. The text states: 'All training is performed in PyTorch using NVIDIA GeForce RTX 2080 Ti GPUs.' |
| Experiment Setup | Yes | The model is optimized using stochastic gradient descent with Nesterov momentum (Duchi et al., 2011). We set the weight decay coefficient to be 0.0005, and momentum to be 0.09. Models are initialized with a model pre-trained on the CIFAR-10/CIFAR-100 data and trained for 100 epochs in the Ptest_out = Pout setting and for 50 epochs in the Ptest_out ≠ Pout setting. ... The initial learning rate is set to be 0.001 and decayed by a factor of 2 after 50%, 75%, and 90% of the epochs. We use a batch size of 128 and a dropout rate of 0.3. ... For optimization in WOODS, we vary the penalty multiplier γ ∈ {1.1, 1.5} and the dual update learning rate µ2 ∈ {0.1, 1, 2}. We set tol = 0.05, α = 0.05, and set τ to be twice the loss of the pre-trained model. |
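The outer loop of Algorithm 1 (dual ascent on λ, then geometric growth of the penalty weights β1/β2 when a constraint is violated) can be sketched as a small pure-Python function. This is a minimal illustration of the update schedule only, not the authors' implementation; the function name, the scalar arguments `ood_loss`, `cls_loss`, and `grad_lam`, and the default values are all illustrative stand-ins for the batch-averaged quantities in the algorithm.

```python
def woods_outer_updates(beta1, beta2, lam, ood_loss, cls_loss, grad_lam,
                        mu2=1.0, gamma=1.5, alpha=0.05, tau=0.1, tol=0.05):
    """One end-of-epoch step of Algorithm 1's outer loop.

    ood_loss  -- average OOD loss (1/n) sum_i Lood(...) on wild data
    cls_loss  -- average classification loss (1/n) sum_i Lcls(...)
    grad_lam  -- gradient of L_beta with respect to the dual variable lam
    """
    # Dual ascent step on the Lagrange multiplier (line 7).
    lam = lam + mu2 * grad_lam
    # If the OOD constraint is violated, grow its penalty weight (lines 8-10).
    if ood_loss > alpha + tol:
        beta1 = gamma * beta1
    # If the classification constraint is violated, grow its weight (lines 11-13).
    if cls_loss > tau + tol:
        beta2 = gamma * beta2
    return beta1, beta2, lam
```

The inner SGD loop over θ is omitted; only the constraint-checking logic that distinguishes WOODS's augmented-Lagrangian formulation from plain penalized training is shown.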
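The dataset-split row above can likewise be made concrete at the index level: half of CIFAR becomes ID training data and half feeds the wild mixture; 70% of SVHN goes into a mixture/validation pool (30% of that pool held out for validation) and the remaining 30% becomes the OOD test set. The sketch below assumes these proportions and the standard SVHN training-set size of 73,257; the function and variable names are illustrative, not from the paper's code.

```python
import random

def make_splits(n_cifar=50_000, n_svhn=73_257, seed=0):
    """Return index lists (id_train, wild_pool, ood_train, ood_val, ood_test)
    following the split proportions described in the paper."""
    rng = random.Random(seed)

    # CIFAR: 25,000 ID training images, 25,000 reserved for the wild mixture.
    cifar = list(range(n_cifar))
    rng.shuffle(cifar)
    id_train, wild_pool = cifar[:n_cifar // 2], cifar[n_cifar // 2:]

    # SVHN: 70% for the mixture/validation pool, 30% held out as the test set.
    svhn = list(range(n_svhn))
    rng.shuffle(svhn)
    n_pool = int(0.7 * n_svhn)
    pool, ood_test = svhn[:n_pool], svhn[n_pool:]

    # Within the pool: 30% validation, 70% mixture training.
    n_val = int(0.3 * n_pool)
    ood_val, ood_train = pool[:n_val], pool[n_val:]
    return id_train, wild_pool, ood_train, ood_val, ood_test
```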
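The learning-rate schedule in the experiment-setup row (initial rate 0.001, decayed by a factor of 2 after 50%, 75%, and 90% of the epochs) corresponds to a simple step schedule. A minimal sketch, with illustrative names and the milestone fractions taken from the quoted setup:

```python
def learning_rate(epoch, total_epochs, base_lr=1e-3, decay=2.0,
                  milestones=(0.5, 0.75, 0.9)):
    """Step schedule: halve the rate after 50%, 75%, and 90% of epochs."""
    lr = base_lr
    for frac in milestones:
        if epoch >= frac * total_epochs:
            lr /= decay
    return lr
```

For a 100-epoch run this gives 0.001 for epochs 0-49, 0.0005 for 50-74, 0.00025 for 75-89, and 0.000125 from epoch 90 on.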