Deep One-Class Classification via Interpolated Gaussian Descriptor
Authors: Yuanhong Chen, Yu Tian, Guansong Pang, Gustavo Carneiro
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In extensive experiments on diverse popular benchmarks, including MNIST, Fashion MNIST, CIFAR10, MVTec AD and two medical datasets, IGD achieves better detection accuracy than current state-of-the-art models. IGD also shows better robustness in problems with small or contaminated training sets. |
| Researcher Affiliation | Academia | 1Australian Institute for Machine Learning, University of Adelaide, Australia 2School of Computing and Information Systems, Singapore Management University, Singapore {yuanhong.chen, yu.tian01, gustavo.carneiro}@adelaide.edu.au, pangguansong@gmail.com |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We implement our framework using Pytorch.' but does not include an explicit statement about releasing the source code or a link to a repository for the described methodology. |
| Open Datasets | Yes | We use four computer vision and two medical image datasets to evaluate our methods. The computer vision datasets are MNIST (Le Cun, Cortes, and Burges 2010), Fashion MNIST (Xiao, Rasul, and Vollgraf 2017), CIFAR10 (Krizhevsky, Nair, and Hinton 2014) and MVTec AD (Bergmann et al. 2019); and the medical image datasets are Hyper-Kvasir (Borgli and et al. 2020) and LAG (Li et al. 2019b). |
| Dataset Splits | Yes | On MNIST, Fashion MNIST and CIFAR10, we use the same protocol as described in (Ruff et al. 2018). CIFAR10 contains 60,000 images with 10 classes. MNIST and Fashion MNIST contain 70,000 images with 10 classes of handwritten digits and fashion products, respectively. MVTec AD (Bergmann et al. 2019) contains 5,354 high-resolution real-world images of 15 different industrial objects and textures. The normal class of MVTec AD is formed by 3,629 training and 467 testing images without defects. For Hyper-Kvasir, we have 1,600 normal images without polyps in the training set and 500 in the testing set; and 1,000 abnormal images containing polyps in the testing set. For LAG, we have 2,343 normal images without glaucoma in the training set; and 800 normal images and 1,711 abnormal images with glaucoma for testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions 'Pytorch' but does not specify any version numbers for Pytorch or any other software dependencies, making the software environment not fully reproducible. |
| Experiment Setup | Yes | The model was trained with the Adam optimiser using a learning rate of 0.0001, weight decay of 10^-6, batch size of 64 images, and 256 epochs for all datasets. We defined the representation space produced by the encoder to have Z = 128 dimensions. We set ρ = 0.15 to balance the contribution of MAE and MS-SSIM losses in (12) and (17). We set λ1 = λ2 = 1 in (7) and λ3 = 0.1 in (11), based on cross validation experiments. We use ResNet18 and its reverse architecture as the encoder and decoder for both the global and local IGD models. For this SSL pre-training, we use the SGD optimiser with a learning rate of 0.01, weight decay 10^-1, batch size of 32, and 2,000 epochs. For this ImageNet KD pre-training, we use the Adam optimiser with a learning rate of 0.0001, weight decay 10^-5, batch size of 64, and 50,000 iterations. |
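Since the paper releases no code, the reported hyperparameters are the only concrete handle for reproduction. A minimal sketch collecting them as plain Python dictionaries is shown below; all key names are illustrative assumptions, and only the numeric values come from the paper's quoted setup.

```python
# Hedged sketch: the three training configurations reported in the paper,
# gathered as plain dictionaries. Dictionary keys are our own naming;
# the values are taken verbatim from the Experiment Setup quote above.

MAIN_TRAINING = {
    "optimiser": "Adam",
    "learning_rate": 1e-4,
    "weight_decay": 1e-6,
    "batch_size": 64,
    "epochs": 256,
    "latent_dim": 128,   # Z = 128 encoder dimensions
    "rho": 0.15,         # balances MAE vs. MS-SSIM losses in Eqs. (12), (17)
    "lambda1": 1.0,      # Eq. (7)
    "lambda2": 1.0,      # Eq. (7)
    "lambda3": 0.1,      # Eq. (11)
    "backbone": "ResNet18 encoder + reversed-ResNet18 decoder",
}

SSL_PRETRAINING = {
    "optimiser": "SGD",
    "learning_rate": 0.01,
    "weight_decay": 1e-1,
    "batch_size": 32,
    "epochs": 2000,
}

IMAGENET_KD_PRETRAINING = {
    "optimiser": "Adam",
    "learning_rate": 1e-4,
    "weight_decay": 1e-5,
    "batch_size": 64,
    "iterations": 50_000,
}
```

Such a structured record could be fed into a PyTorch training script (e.g. to construct `torch.optim.Adam(params, lr=..., weight_decay=...)`), but the mapping to the authors' actual code is unverifiable without a released repository.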