The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition

Authors: Jun Cen, Di Luan, Shiwei Zhang, Yixuan Pei, Yingya Zhang, Deli Zhao, Shaojie Shen, Qifeng Chen

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this paper, we deeply analyze the UOSR task under different training and evaluation settings to shed light on this promising research direction. For this purpose, we first evaluate the UOSR performance of several OSR methods and show a significant finding that the UOSR performance consistently surpasses the OSR performance by a large margin for the same method. We show that the reason lies in the known but wrongly classified samples, as their uncertainty distribution is extremely close to that of unknown samples rather than that of known and correctly classified samples. Second, we analyze how the two training settings of OSR (i.e., pre-training and outlier exposure) influence UOSR. We find that although both are beneficial for distinguishing known and correctly classified samples from unknown samples, pre-training is also helpful for identifying known but wrongly classified samples while outlier exposure is not. In addition to different training settings, we also formulate a new evaluation setting for UOSR called few-shot UOSR, where only one or five samples per unknown class are available during evaluation to help identify unknown samples. We propose FS-KNNS for few-shot UOSR to achieve state-of-the-art performance under all settings. (A hedged sketch of the UOSR vs. OSR evaluation protocol follows the table.)
Researcher Affiliation | Collaboration | 1 Cheng Kar-Shun Robotics Institute, The Hong Kong University of Science and Technology; 2 Alibaba Group; 3 Xi'an Jiaotong University. {jcenaa,dluan}@connect.ust.hk, {zhangjin.zsw,yingya.zyy,deli.zdl}@alibaba-inc.com, peiyixuan@stu.xjtu.edu.cn, {eeshaojie,cqf}@ust.hk
Pseudocode | Yes | Therefore, the uncertainty score is $\hat{u}_{fs\text{-}knn} = \frac{1}{\mathrm{topK}(S^{train}) + \mathrm{topK}(S^{ref}_{test})}$ (1). (A sketch of computing this score follows the table.)
Open Source Code | Yes | Code: https://github.com/Jun-CEN/Unified_Open_Set_Recognition.
Open Datasets | Yes | In the image domain, the InD dataset is CIFAR-100 (Krizhevsky et al., 2009) and the OoD datasets are TinyImageNet (Le & Yang, 2015) and LSUN (Yu et al., 2015).
Dataset Splits | Yes | The training InD dataset is CIFAR-100, which contains 100 classes with 50000 training images and 10000 test images.
Hardware Specification | No | The paper mentions using specific network backbones like ResNet50, VGG13, TSM, and I3D, but it does not specify the hardware (e.g., GPU models, CPU types) used to run the experiments.
Software Dependencies | No | The paper mentions various frameworks and models (e.g., Softmax, ODIN, OpenMax, ResNet50, TSM, BiT, ImageNet, Kinetics400) but does not provide specific version numbers for software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup | Yes | When we train the model from scratch, we find that setting the base learning rate to 0.1 and decaying it by a factor of 10 every 24000 steps, for a total of 120000 steps, achieves good enough closed-set performance. We use a linear warmup strategy to warm up the training in the first 500 steps. We use SGD with momentum and a batch size of 128 for all models. (A sketch of this schedule follows the table.)
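
The abstract quoted under Research Type contrasts OSR with UOSR, where known but wrongly classified samples are grouped with unknown samples on the reject side. The sketch below illustrates that distinction with an AUROC-style evaluation; the label construction and the toy scores are assumptions for illustration, not code from the paper's repository.

```python
# Hypothetical illustration of OSR vs. UOSR evaluation via AUROC.
# Assumes per-sample uncertainty scores, an unknown-class flag, and a
# correct-prediction flag for known samples.
import numpy as np
from sklearn.metrics import roc_auc_score

def osr_auroc(uncertainty, is_unknown):
    # OSR: separate unknown samples from all known samples.
    return roc_auc_score(is_unknown, uncertainty)

def uosr_auroc(uncertainty, is_unknown, is_correct):
    # UOSR: known but wrongly classified samples are grouped with unknown
    # samples, so only known-and-correct predictions should be accepted.
    reject = is_unknown | ~is_correct
    return roc_auc_score(reject, uncertainty)

# Toy example with 6 test samples (values are made up).
uncertainty = np.array([0.1, 0.2, 0.8, 0.9, 0.7, 0.3])
is_unknown  = np.array([False, False, True, True, False, False])
is_correct  = np.array([True, True, False, False, False, True])

print("OSR AUROC: ", osr_auroc(uncertainty, is_unknown))
print("UOSR AUROC:", uosr_auroc(uncertainty, is_unknown, is_correct))
```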
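Eq. (1) quoted under Pseudocode builds the few-shot uncertainty score from top-K similarity sums. The sketch below shows one plausible reading, assuming $S^{train}$ and $S^{ref}_{test}$ are cosine similarities between a test feature and the training features and the few-shot reference features, respectively, and that topK(·) averages the K largest values; the feature normalization and the value of K are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the FS-KNNS uncertainty score in Eq. (1).
import torch
import torch.nn.functional as F

def topk_mean(similarities: torch.Tensor, k: int) -> torch.Tensor:
    # Average of the K largest cosine similarities (K capped at set size).
    k = min(k, similarities.numel())
    return similarities.topk(k).values.mean()

def fs_knn_uncertainty(test_feat, train_feats, ref_feats, k: int = 10):
    # Cosine similarities to the training set and to the few-shot reference set.
    test_feat = F.normalize(test_feat, dim=-1)
    s_train = F.normalize(train_feats, dim=-1) @ test_feat
    s_ref = F.normalize(ref_feats, dim=-1) @ test_feat
    # Eq. (1): uncertainty is the inverse of the summed top-K similarities.
    return 1.0 / (topk_mean(s_train, k) + topk_mean(s_ref, k))
```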
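The schedule quoted under Experiment Setup maps directly onto a PyTorch optimizer plus LambdaLR scheduler, sketched below; the momentum value (0.9) and the exact warmup formula are assumptions beyond what the quote states.

```python
# Hypothetical PyTorch setup matching the quoted schedule: SGD with momentum,
# batch size 128, base LR 0.1 decayed by 10x every 24000 steps over 120000
# total steps, with a 500-step linear warmup. Momentum 0.9 is an assumed value.
import torch

def make_optimizer_and_scheduler(model):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

    def lr_lambda(step):
        if step < 500:                    # linear warmup over the first 500 steps
            return step / 500.0
        return 0.1 ** (step // 24000)     # decay by 10 every 24000 steps

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    return optimizer, scheduler

# Training loop skeleton (batch size 128, 120000 total steps):
# for step in range(120000):
#     ... forward / backward / optimizer.step() ...
#     scheduler.step()
```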