Beyond Mahalanobis Distance for Textual OOD Detection
Authors: Pierre Colombo, Eduardo Dadalto, Guillaume Staerman, Nathan Noiry, Pablo Piantanida
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive numerical experiments involve 51k model configurations, including various checkpoints, seeds, and datasets, and demonstrate that TRUSTED achieves state-of-the-art performances. We conduct extensive numerical experiments and prove that our method improves over SOTA methods. |
| Researcher Affiliation | Collaboration | Pierre Colombo Mathématiques et Informatique pour la Complexité et les Systèmes Centrale Supelec, Université Paris Saclay pierre.colombo@centralesupelec.fr...Nathan Noiry althiqua.io noirynathan@gmail.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | 3. We release open-source code and data to ease future research, ensure reproducibility and reduce computation overhead. |
| Open Datasets | Yes | The considered benchmark is composed of three different types of in distribution datasets (referred to as IN-DS) which are used to train the classifiers: sentiment analysis (i.e., SST2 [88] and IMDB [70]), topic classification (i.e., 20Newsgroup [54]) and question answering (i.e., TREC-10 [61]). |
| Dataset Splits | Yes | For splitting we use either the standard split or the one provided by [103]. Notice that after 3k iterations models have converged and no over-fitting is observed even after 20k iterations (i.e., we do not observe an increase in validation loss). |
| Hardware Specification | No | This work was also granted access to the HPC resources of IDRIS under the allocation 2021AP010611665 as well as under the project 2021-101838 made by GENCI. |
| Software Dependencies | No | We trained all models with a dropout rate [89] of 0.2, a batch size of 32, we use ADAMW [55]. |
| Experiment Setup | Yes | We trained all models with a dropout rate [89] of 0.2, a batch size of 32, we use ADAMW [55]. Additionally, the weight decay is set to 0.01, the warmup ratio is set to 0.06 and the learning rate to 10⁻⁵. |
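The reported training setup (dropout 0.2, batch size 32, AdamW, weight decay 0.01, warmup ratio 0.06, learning rate 10⁻⁵) can be collected into a minimal sketch. The linear-decay-after-warmup shape and the 20k total-step budget are assumptions on our part; the paper states only the warmup ratio, the learning rate, and that training runs to 20k iterations without over-fitting.

```python
# Hyperparameters as reported in the table above.
DROPOUT = 0.2
BATCH_SIZE = 32
WEIGHT_DECAY = 0.01
WARMUP_RATIO = 0.06
PEAK_LR = 1e-5
TOTAL_STEPS = 20_000  # assumption: the paper trains up to 20k iterations

def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup to PEAK_LR over the
    first WARMUP_RATIO fraction of training, then linear decay to 0.
    The decay shape is assumed, not stated in the paper."""
    warmup_steps = int(WARMUP_RATIO * TOTAL_STEPS)
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - warmup_steps)
```

These constants would typically be passed to an AdamW optimizer and scheduler; the helper above only makes the reported schedule parameters concrete.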