Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Red PANDA: Disambiguating Image Anomaly Detection by Removing Nuisance Factors
Authors: Niv Cohen, Jonathan Kahana, Yedid Hoshen
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENTS |
| Researcher Affiliation | Academia | Niv Cohen Jonathan Kahana Yedid Hoshen School of Computer Science and Engineering The Hebrew University of Jerusalem, Israel EMAIL |
| Pseudocode | Yes | An algorithm box summarizing the different steps can be found in App. J. |
| Open Source Code | Yes | The presented benchmarks are available on GitHub under: https://github.com/NivC/RedPANDA. |
| Open Datasets | Yes | Cars3D (Reed et al., 2015), SmallNORB (LeCun et al., 2004), Edges2Shoes (Isola et al., 2017). |
| Dataset Splits | No | We include only true anomalies and pseudo-anomalies in the test set, and split the normal samples between the training set and the test set (85%/15% train/test split). (No explicit validation split mentioned). |
| Hardware Specification | Yes | The entire project used in total 3000 hours of NVIDIA RTX A5000 GPU (including development, testing, and comparisons). All resources were supplied by a local internal cluster. |
| Software Dependencies | No | The implementation uses the PyTorch and faiss (Johnson et al., 2019) packages. CLIP (Radford et al., 2021) and SimCLR (Chen et al., 2020) are also mentioned. No specific version numbers are provided for these software components. |
| Experiment Setup | Yes | All images were used in a 64 × 64 resolution. For the contrastive temperature, we use τ = 0.1 for all the datasets. We scale down the loss Lrec by a factor of 0.3. We use 200 training epochs. In each batch we used 32 images from 4 different nuisance classes (a batch size of 128, in total). We used learning rates of 1 × 10−4 and 3 × 10−4 for the encoder and generator (respectively). We use the faiss (Johnson et al., 2019) kNN implementation, using k = 1. |
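The kNN scoring step reported in the setup row (k = 1) amounts to scoring each test sample by its distance to the nearest normal training feature. A minimal sketch of that step is below; it uses plain NumPy rather than the faiss implementation the paper reports, and the feature dimensions and function name are illustrative assumptions, not taken from the paper's code.

```python
import numpy as np


def knn_anomaly_scores(train_feats: np.ndarray, test_feats: np.ndarray) -> np.ndarray:
    """Score each test feature by its Euclidean distance to the nearest
    normal training feature (k = 1, as in the reported setup)."""
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2
    d2 = (
        (test_feats ** 2).sum(axis=1, keepdims=True)
        - 2.0 * test_feats @ train_feats.T
        + (train_feats ** 2).sum(axis=1)
    )
    # Clip tiny negative values from floating-point error, take the 1-NN distance
    return np.sqrt(np.maximum(d2, 0.0).min(axis=1))


# Toy demonstration with synthetic features (not the paper's data):
rng = np.random.default_rng(0)
normal_train = rng.normal(0.0, 1.0, size=(100, 8))          # "normal" features
test = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 8)),                       # in-distribution
    rng.normal(6.0, 1.0, size=(5, 8)),                       # far-away anomalies
])
scores = knn_anomaly_scores(normal_train, test)              # higher = more anomalous
```

With faiss, the same computation would use an exact index (e.g. a flat L2 index) searched with k = 1; the NumPy version above is only a self-contained stand-in.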