Typicalness-Aware Learning for Failure Detection
Authors: Yijun Liu, Jiequan Cui, Zhuotao Tian, Senqiao Yang, Qingdong He, Xiaoling Wang, Jingyong Su
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | TAL has been extensively evaluated on benchmark datasets, and the results demonstrate its superiority over existing failure detection methods. |
| Researcher Affiliation | Collaboration | Yijun Liu¹, Jiequan Cui², Zhuotao Tian¹, Senqiao Yang³, Qingdong He⁴, Xiaoling Wang¹, Jingyong Su¹ ({liuyijun}@stu.hit.edu.cn); ¹Harbin Institute of Technology (Shenzhen), ²Nanyang Technological University, ³The Chinese University of Hong Kong, ⁴Tencent Youtu Lab |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the paper. |
| Open Source Code | Yes | Code is available at https://github.com/liuyijungoon/TAL. |
| Open Datasets | Yes | Datasets and models. We first evaluate on the small-scale CIFAR-100 [21] dataset with SVHN [11] as its out-of-distribution (OOD) test set. To demonstrate scalability, we further conduct experiments on large-scale ImageNet [5] using ResNet-50, with Textures [3] and WILDS [20] serving as OOD data. |
| Dataset Splits | Yes | The original CIFAR-100 dataset consists of 50,000 training images, with 5,000 images reserved for validation and the remaining 45,000 images used for training (see the split sketch after the table). |
| Hardware Specification | Yes | The models are trained for 200 epochs with a batch size of 256 on a single NVIDIA GeForce RTX 3090 GPU. On ImageNet [5], we use the ResNet-50 architecture as our backbone. The models are trained for 90 epochs with an initial learning rate of 0.1 on a single NVIDIA A100. |
| Software Dependencies | No | The paper mentions software components like 'SGD optimizer', 'Cosine Annealing LR scheduler', and 'timm library', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For experiments on CIFAR [21], we employ an SGD optimizer with an initial learning rate of 0.1, a momentum of 0.9, and a weight decay of 0.0005. The models are trained for 200 epochs with a batch size of 256 on a single NVIDIA GeForce RTX 3090 GPU. Furthermore, we adopt a Cosine Annealing LR scheduler to adjust the learning rate during training (see the training-configuration sketch after the table). On ImageNet [5], we use the ResNet-50 architecture as our backbone. The models are trained for 90 epochs with an initial learning rate of 0.1 on a single NVIDIA A100. The learning rate is decayed by a factor of 0.1 every 30 epochs. ... where we empirically set Tmax and Tmin to 10 and 100, and they perform well on different benchmarks. |
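
The 45,000/5,000 CIFAR-100 split reported in the table can be reproduced along the following lines. This is a minimal sketch in PyTorch, assuming a `torchvision` dataset download, a plain `ToTensor` transform, and an arbitrary random seed; the paper does not state the transform or seed it used.

```python
# Sketch of the CIFAR-100 split described in the paper: 50,000 training
# images divided into 45,000 for training and 5,000 for validation.
# The transform and seed below are illustrative assumptions.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()
full_train = datasets.CIFAR100(root="./data", train=True, download=True,
                               transform=transform)

# 45,000 / 5,000 split; the generator seed is an assumption for reproducibility.
generator = torch.Generator().manual_seed(0)
train_set, val_set = random_split(full_train, [45_000, 5_000], generator=generator)
```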
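
For the CIFAR training recipe quoted above (SGD with learning rate 0.1, momentum 0.9, weight decay 0.0005, 200 epochs, batch size 256, Cosine Annealing LR), a minimal configuration sketch follows. The `resnet18` backbone and the plain cross-entropy loss are stand-in assumptions; the paper's TAL objective and its dynamic temperature are not reproduced here.

```python
# Sketch of the reported CIFAR-100 optimization settings: SGD (lr=0.1,
# momentum=0.9, weight_decay=5e-4), 200 epochs, batch size 256, and a
# CosineAnnealingLR schedule. Backbone and loss are placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision.models import resnet18  # backbone choice is an assumption

model = resnet18(num_classes=100)
criterion = nn.CrossEntropyLoss()        # stand-in for the TAL loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
epochs = 200
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

# train_set comes from the split sketch above.
train_loader = DataLoader(train_set, batch_size=256, shuffle=True, num_workers=4)

for epoch in range(epochs):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

For the reported ImageNet setting, the same sketch would swap in a ResNet-50 backbone, 90 epochs, and `torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)` in place of the cosine schedule.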