Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources

Authors: Haotian Zheng, Qizhou Wang, Zhen Fang, Xiaobo Xia, Feng Liu, Tongliang Liu, Bo Han

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments under various OOD detection setups, demonstrating the effectiveness of our method against its advanced counterparts. This section conducts extensive experiments for ATOL in OOD detection. In Section 5.1, we describe the experiment setup. In Section 5.2, we demonstrate the main results of our method against the data generation-based counterparts on both the CIFAR [30] and the ImageNet [8] benchmarks. In Section 5.3, we further conduct ablation studies to comprehensively analyze our method.
Researcher Affiliation | Academia | Haotian Zheng (1,2), Qizhou Wang (1), Zhen Fang (3), Xiaobo Xia (4), Feng Liu (5), Tongliang Liu (4), Bo Han (1). Affiliations: (1) Department of Computer Science, Hong Kong Baptist University; (2) School of Electronic Engineering, Xidian University; (3) Australian Artificial Intelligence Institute, University of Technology Sydney; (4) Sydney AI Centre, The University of Sydney; (5) School of Computing and Information Systems, The University of Melbourne.
Pseudocode | Yes | The pseudocode is further summarized in Appendix D due to the space limit.
Open Source Code | Yes | The code is publicly available at: https://github.com/tmlr-group/ATOL.
Open Datasets | Yes | For the CIFAR benchmarks, we employ the WRN-40-2 [78] as the backbone model. Following [36], models have been trained for 200 epochs via empirical risk minimization, with a batch size 64, momentum 0.9, and initial learning rate 0.1. ... For the CIFAR cases, we employed Texture [6], SVHN [47], Places365 [84], LSUN-Crop [76], LSUN-Resize [76], and iSUN [74]. For the ImageNet case, we employed iNaturalist [18], SUN [74], Places365 [84], and Texture [6].
Dataset Splits | Yes | Hyper-parameters are chosen based on the OOD detection performance on validation datasets.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments were provided in the paper. The paper mentions computation time but not the hardware it ran on.
Software Dependencies | No | The paper mentions the 'PyTorch official repository' and 'TensorFlow hub' but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | For the CIFAR benchmarks, we employ the WRN-40-2 [78] as the backbone model. Following [36], models have been trained for 200 epochs via empirical risk minimization, with a batch size 64, momentum 0.9, and initial learning rate 0.1. The learning rate is divided by 10 after 100 and 150 epochs. ... For the CIFAR benchmarks, ATOL is run for 10 epochs and uses SGD with an initial learning rate 0.01 and the cosine decay [37]. The batch size is 64 for real ID cases, 64 for auxiliary ID cases, and 256 for auxiliary OOD cases. In latent space, the auxiliary ID distribution is the high density region of the Mixture of Gaussian (MoG). ... For the CIFAR benchmarks, the value of α is set to 1, µ is 5, σ is 0.1, and u is 8.
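The pre-training schedule quoted in the Experiment Setup row (initial learning rate 0.1, divided by 10 after epochs 100 and 150) can be sketched as a small step-decay helper. The function name and its parameterization are illustrative assumptions, not taken from the ATOL code.

```python
def step_decay_lr(epoch, base_lr=0.1, milestones=(100, 150), factor=0.1):
    """Return the learning rate at a given epoch under step decay.

    Mirrors the quoted schedule: the rate starts at base_lr and is
    multiplied by `factor` once for each milestone epoch already passed
    (hypothetical helper; milestones and factor are parameters so the
    sketch stays general).
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= factor
    return lr

# The three plateaus of the quoted 200-epoch schedule:
# epochs 0-99 at 0.1, 100-149 at 0.01, 150-199 at 0.001.
schedule = [step_decay_lr(e) for e in (0, 99, 100, 149, 150, 199)]
```

In a PyTorch training loop the same behavior is typically obtained with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)`.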
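The latent-space construction quoted above (auxiliary ID data drawn from the high-density region of a Mixture of Gaussians, with µ = 5 and σ = 0.1) can be sketched as follows. The class-conditional means built from random unit directions, the function name, and the choice of latent dimension are assumptions for illustration; the paper's exact parameterization (including the role of u = 8) is not reproduced here.

```python
import numpy as np

def sample_auxiliary_id(n, num_classes=10, latent_dim=8, mu=5.0,
                        sigma=0.1, seed=0):
    """Draw latent codes from a class-conditional Mixture of Gaussians.

    Hypothetical construction: each class k gets a Gaussian centered at
    mu * d_k for a fixed random unit direction d_k, so all modes sit at
    distance mu from the origin; a small sigma keeps samples in the
    high-density region around each mode.
    """
    rng = np.random.default_rng(seed)
    directions = rng.standard_normal((num_classes, latent_dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    means = mu * directions
    labels = rng.integers(0, num_classes, size=n)
    z = means[labels] + sigma * rng.standard_normal((n, latent_dim))
    return z, labels

z, y = sample_auxiliary_id(256)  # 256 matches the quoted auxiliary OOD batch size
```

With σ = 0.1 the samples stay tightly concentrated around the modes at radius µ = 5, which is the "high density region" behavior the quoted setup describes.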