Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Provable Guarantees for Understanding Out-of-Distribution Detection
Authors: Peyman Morteza, Yixuan Li (pp. 7831-7840)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations demonstrate the competitive performance of the new scoring function. In particular, on CIFAR-100 as in-distribution data, GEM outperforms (Liu et al. 2020) by 16.57% (FPR95). |
| Researcher Affiliation | Academia | University of Wisconsin-Madison |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at: https://github.com/PeymanMorteza/GEM |
| Open Datasets | Yes | We use CIFAR-10 and CIFAR-100 (Krizhevsky, Hinton et al. 2009) datasets as in-distribution data. |
| Dataset Splits | Yes | We use the standard split, and train with Wide ResNet architecture (Zagoruyko and Komodakis 2016) with depth 40. |
| Hardware Specification | No | The paper does not specify any details about the hardware used for running experiments. |
| Software Dependencies | No | The paper mentions using 'Wide ResNet architecture' but does not specify software dependencies with version numbers (e.g., deep learning framework and its version). |
| Experiment Setup | No | The paper states training with 'Wide ResNet architecture with depth 40' but lacks specific hyperparameter values (e.g., learning rate, batch size, number of epochs) and detailed optimizer settings. |