Typicalness-Aware Learning for Failure Detection

Authors: Yijun Liu, Jiequan Cui, Zhuotao Tian, Senqiao Yang, Qingdong He, Xiaoling Wang, Jingyong Su

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | TAL has been extensively evaluated on benchmark datasets, and the results demonstrate its superiority over existing failure detection methods.
Researcher Affiliation | Collaboration | Yijun Liu (1), Jiequan Cui (2), Zhuotao Tian (1), Senqiao Yang (3), Qingdong He (4), Xiaoling Wang (1), Jingyong Su (1); {liuyijun}@stu.hit.edu.cn; (1) Harbin Institute of Technology (Shenzhen), (2) Nanyang Technological University, (3) The Chinese University of Hong Kong, (4) Tencent Youtu Lab
Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the paper.
Open Source Code | Yes | Code is available at https://github.com/liuyijungoon/TAL.
Open Datasets | Yes | Datasets and models. We first evaluate on the small-scale CIFAR-100 [21] dataset with SVHN [11] as its out-of-distribution (OOD) test set. To demonstrate scalability, we further conduct experiments on large-scale ImageNet [5] using ResNet-50, with Textures [3] and WILDS [20] serving as OOD data.
Dataset Splits | Yes | The original CIFAR-100 dataset consists of 50,000 training images, with 5,000 images reserved for validation and the remaining 45,000 images used for training.
Hardware Specification | Yes | The models are trained for 200 epochs with a batch size of 256 on a single NVIDIA GeForce RTX 3090 GPU. On ImageNet [5], we use the ResNet-50 architecture as our backbone. The models are trained for 90 epochs with an initial learning rate of 0.1 on a single NVIDIA A100.
Software Dependencies | No | The paper mentions software components such as the SGD optimizer, a Cosine Annealing LR scheduler, and the timm library, but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | For experiments on CIFAR [21], we employ an SGD optimizer with an initial learning rate of 0.1, a momentum of 0.9, and a weight decay of 0.0005. The models are trained for 200 epochs with a batch size of 256 on a single NVIDIA GeForce RTX 3090 GPU. Furthermore, we adopt a Cosine Annealing LR scheduler to adjust the learning rate during training. On ImageNet [5], we use the ResNet-50 architecture as our backbone. The models are trained for 90 epochs with an initial learning rate of 0.1 on a single NVIDIA A100. The learning rate is decayed by a factor of 0.1 every 30 epochs. ... where we empirically set T_max and T_min to 10 and 100, and they perform well on different benchmarks.
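
The "Open Datasets" and "Dataset Splits" rows above describe CIFAR-100 as the in-distribution data (split 45,000/5,000 into train/validation) with SVHN as the OOD test set. The following is a minimal sketch of that data setup, assuming torchvision; the root path, normalization statistics, and random seed are placeholders, not taken from the paper.

```python
# Minimal sketch (not the authors' code) of the ID/OOD evaluation data described above.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    # Commonly used CIFAR-100 channel statistics; the paper's exact preprocessing is not stated here.
    transforms.Normalize((0.5071, 0.4865, 0.4409), (0.2673, 0.2564, 0.2762)),
])

# CIFAR-100: 50,000 training images, split into 45,000 for training and 5,000 for validation.
full_train = datasets.CIFAR100(root="./data", train=True, download=True, transform=transform)
train_set, val_set = random_split(
    full_train, [45_000, 5_000], generator=torch.Generator().manual_seed(0)
)

# In-distribution test set and SVHN as the OOD test set.
id_test = datasets.CIFAR100(root="./data", train=False, download=True, transform=transform)
ood_test = datasets.SVHN(root="./data", split="test", download=True, transform=transform)
```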
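The "Experiment Setup" row quotes the CIFAR training configuration: SGD with learning rate 0.1, momentum 0.9, weight decay 0.0005, batch size 256, 200 epochs, and a Cosine Annealing LR schedule. The sketch below wires those settings into a standard PyTorch loop; the ResNet-18 backbone and the plain cross-entropy loss are placeholders (the paper's TAL objective is not reproduced here), and `train_set` comes from the data sketch above.

```python
# Minimal sketch of the quoted CIFAR training hyperparameters; TAL-specific components omitted.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision.models import resnet18  # placeholder backbone, not necessarily the paper's

model = resnet18(num_classes=100)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)  # anneal over 200 epochs
criterion = nn.CrossEntropyLoss()  # stand-in for the paper's typicalness-aware loss
train_loader = DataLoader(train_set, batch_size=256, shuffle=True, num_workers=4)

for epoch in range(200):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # one scheduler step per epoch
```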
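For ImageNet, the same row states a ResNet-50 backbone, 90 epochs, an initial learning rate of 0.1, and a decay by a factor of 0.1 every 30 epochs. A sketch of that schedule follows, using timm only because the paper mentions the library; the model name string, momentum, and weight decay values are assumptions, and the data loader and TAL loss are omitted.

```python
# Minimal sketch of the quoted ImageNet schedule; momentum and weight decay are assumed defaults.
import timm
import torch
from torch import optim

model = timm.create_model("resnet50", pretrained=False, num_classes=1000)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)  # x0.1 every 30 epochs

for epoch in range(90):
    # ... one epoch of training over ImageNet (loader and loss omitted) ...
    scheduler.step()
```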