Red PANDA: Disambiguating Image Anomaly Detection by Removing Nuisance Factors
Authors: Niv Cohen, Jonathan Kahana, Yedid Hoshen
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5: "Experiments" |
| Researcher Affiliation | Academia | Niv Cohen, Jonathan Kahana, Yedid Hoshen; School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel; nivc@cs.huji.ac.il |
| Pseudocode | Yes | An algorithm box summarizing the different steps can be found in App. J. |
| Open Source Code | Yes | The presented benchmarks are available on GitHub under: https://github.com/NivC/RedPANDA |
| Open Datasets | Yes | Cars3D (Reed et al., 2015); SmallNORB (LeCun et al., 2004); Edges2Shoes (Isola et al., 2017) |
| Dataset Splits | No | We include only true anomalies and pseudo-anomalies in the test set, and split the normal samples between the training set and the test set (85%/15% train/test split). No explicit validation split is mentioned. |
| Hardware Specification | Yes | The entire project used in total 3000 hours of NVIDIA RTX A5000 GPU (including development, testing, and comparisons). All resources were supplied by a local internal cluster. |
| Software Dependencies | No | The implementation uses the PyTorch and faiss (Johnson et al., 2019) packages. CLIP (Radford et al., 2021) and SimCLR (Chen et al., 2020) are also mentioned. No version numbers are provided for any of these software components. |
| Experiment Setup | Yes | All images were used at a 64 × 64 resolution. For the contrastive temperature, we use τ = 0.1 for all the datasets. We scale down the loss Lrec by a factor of 0.3. We use 200 training epochs. In each batch we used 32 images from 4 different nuisance classes (a batch size of 128 in total). We used learning rates of 1 × 10−4 and 3 × 10−4 for the encoder and generator, respectively. We use the faiss (Johnson et al., 2019) kNN implementation, with k = 1. |
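The reported setup can be sketched as a small configuration plus the k-nearest-neighbour anomaly-scoring step. This is a minimal illustration, not the authors' code: the `CONFIG` key names are assumptions, and a brute-force NumPy kNN stands in for the faiss implementation the paper uses.

```python
import numpy as np

# Hyperparameters as reported in the table above (key names are illustrative)
CONFIG = {
    "image_resolution": 64,           # 64 x 64 inputs
    "contrastive_tau": 0.1,           # contrastive temperature
    "rec_loss_weight": 0.3,           # L_rec scaled down by 0.3
    "epochs": 200,
    "images_per_class": 32,           # 32 images per nuisance class...
    "nuisance_classes_per_batch": 4,  # ...from 4 classes -> batch size 128
    "lr_encoder": 1e-4,
    "lr_generator": 3e-4,
    "knn_k": 1,                       # kNN with k = 1
}

def knn_anomaly_score(train_feats, test_feats, k=1):
    """Score each test feature by its distance to the k-th nearest
    normal training feature (brute-force stand-in for faiss kNN)."""
    # pairwise Euclidean distances, shape (n_test, n_train)
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    # distance to the k-th nearest neighbour per test sample
    return np.sort(d, axis=1)[:, k - 1]

# Toy usage: points far from the normal cluster receive higher scores
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(100, 8))           # "normal" features
test = np.vstack([rng.normal(0.0, 1.0, size=(5, 8)),   # pseudo-normal
                  rng.normal(10.0, 1.0, size=(5, 8))])  # anomalous
scores = knn_anomaly_score(normal, test, k=CONFIG["knn_k"])
```

With k = 1 the score is simply the distance to the closest normal training sample, so features of anomalies, which lie far from the normal cluster, score higher than normal test features.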