Out-of-Distribution Detection using Multiple Semantic Label Representations

Authors: Gabi Shalev, Yossi Adi, Joseph Keshet

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated the proposed model on computer vision and speech commands detection tasks and compared it to previous methods. Results suggest that our method compares favorably with previous work. We also demonstrate the efficiency of our approach for detecting wrongly classified and adversarial examples. In this section, we present our experimental results.
Researcher Affiliation | Academia | Gabi Shalev (Bar-Ilan University, Israel; shalev.gabi@gmail.com), Yossi Adi (Bar-Ilan University, Israel; yossiadidrum@gmail.com), Joseph Keshet (Bar-Ilan University, Israel; jkeshet@cs.biu.ac.il)
Pseudocode | No | The paper describes the method in text and provides a schematic diagram (Figure 1), but it does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | We implemented the code using PyTorch [35]. It will be available under www.github.com/MLSpeech/semantic_OOD.
Open Datasets | Yes | We evaluated our approach on CIFAR-10, CIFAR-100 [18] and the Google Speech Commands Dataset, abbreviated here as GCommands. For CIFAR-10 and CIFAR-100 we trained ResNet-18 and ResNet-34 models [12], respectively, using stochastic gradient descent with momentum for 180 epochs. We used the standard normalization and data augmentation techniques. For out-of-distribution examples, we followed a similar setting as in [25, 13] and evaluated our models on several different datasets. Models trained on CIFAR-10 were tested on SVHN [32], LSUN [42] (resized to 32x32x3) and CIFAR-100; models trained on CIFAR-100 were tested on SVHN, LSUN (resized to 32x32x3) and CIFAR-10. (A hedged data-pipeline sketch appears below the table.)
Dataset Splits | No | The paper mentions training and test sets but does not explicitly provide details about a separate validation split, its percentages, or how it was derived.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU or CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions 'PyTorch [35]' but does not provide a specific version number for this or any other software dependency.
Experiment Setup | Yes | In our setting, we used state-of-the-art known architectures as the shared part, and three fully-connected layers, with a ReLU activation function between the first two, as the exclusive part. We evaluated our approach on CIFAR-10, CIFAR-100 [18] and the Google Speech Commands Dataset, abbreviated here as GCommands. For CIFAR-10 and CIFAR-100 we trained ResNet-18 and ResNet-34 models [12], respectively, using stochastic gradient descent with momentum for 180 epochs. We used the standard normalization and data augmentation techniques. We used a learning rate of 0.1, a momentum value of 0.9 and a weight decay of 0.0005. During training we divided the learning rate by 5 after 60, 120 and 160 epochs. For the GCommands dataset, we trained a LeNet model [22] using Adam [16] for 20 epochs with a batch size of 100 and a learning rate of 0.001. (A hedged training-setup sketch appears below the table.)
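
The out-of-distribution evaluation data described in the Open Datasets row can be assembled with torchvision. The following is a minimal sketch under stated assumptions: the paper says only "standard normalization", so the CIFAR-10 channel statistics below are our choice rather than the authors' values, and the root="data" paths are placeholders.

    import torchvision.transforms as T
    import torchvision.datasets as D

    # The usual CIFAR-10 channel statistics; the paper says only
    # "standard normalization", so these exact constants are assumed.
    cifar_norm = T.Normalize(mean=(0.4914, 0.4822, 0.4465),
                             std=(0.2470, 0.2435, 0.2616))

    ood_transform = T.Compose([
        T.Resize((32, 32)),  # "resized to 32x32x3"
        T.ToTensor(),
        cifar_norm,
    ])

    # Out-of-distribution test sets for a model trained on CIFAR-10.
    svhn_ood = D.SVHN(root="data", split="test", download=True,
                      transform=ood_transform)
    cifar100_ood = D.CIFAR100(root="data", train=False, download=True,
                              transform=ood_transform)
    # torchvision's LSUN loader expects the LMDB files to already be on disk.
    lsun_ood = D.LSUN(root="data", classes="test", transform=ood_transform)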
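
The training recipe quoted in the Experiment Setup row also maps onto a short PyTorch sketch. To be clear about what is assumed rather than stated: the paper does not give the number of exclusive heads, their hidden or output dimensions, or the exact scheduler implementation, so num_heads=3, the 512/300 dimensions, and MultiStepLR with gamma=0.2 (i.e., "divided by 5") below are illustrative guesses, not the authors' released code.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class ExclusiveHead(nn.Module):
        # Three fully-connected layers with a ReLU between the first two,
        # as described in the Experiment Setup row.
        def __init__(self, in_dim, hidden_dim, out_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim),
                nn.Linear(hidden_dim, out_dim),
            )

        def forward(self, x):
            return self.net(x)

    class SharedTrunkModel(nn.Module):
        # A shared trunk (here ResNet-18) feeding several exclusive heads,
        # one per semantic label representation.
        def __init__(self, num_heads=3, embed_dim=300):  # both values assumed
            super().__init__()
            self.trunk = models.resnet18()
            self.trunk.fc = nn.Identity()  # expose the 512-d penultimate features
            self.heads = nn.ModuleList(
                [ExclusiveHead(512, 512, embed_dim) for _ in range(num_heads)]
            )

        def forward(self, x):
            h = self.trunk(x)
            return [head(h) for head in self.heads]

    model = SharedTrunkModel()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=0.0005)
    # "we divided the learning rate by 5 after 60, 120 and 160 epochs"
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[60, 120, 160], gamma=0.2)

    for epoch in range(180):
        # ... one training epoch over CIFAR-10/100 goes here ...
        scheduler.step()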