Unsupervised Out-of-Distribution Detection with Diffusion Inpainting

Authors: Zhenzhen Liu, Jin Peng Zhou, Yufan Wang, Kilian Q. Weinberger

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show through extensive experiments that LMD achieves competitive performance across a broad variety of datasets. Code can be found at https://github.com/zhenzhel/lift_map_detect. (Section 4: Experiments)
Researcher Affiliation | Academia | Department of Computer Science, Cornell University, Ithaca, New York, USA.
Pseudocode | Yes | Algorithm 1: Inpaint; Algorithm 2: Lift, Map, Detect (LMD). (A rough code sketch of these procedures follows the table.)
Open Source Code | Yes | Code can be found at https://github.com/zhenzhel/lift_map_detect.
Open Datasets | Yes | We perform OOD detection pairwise among CIFAR10 (Krizhevsky, 2009), CIFAR100 (Krizhevsky, 2009) and SVHN (Netzer et al., 2011), and pairwise among MNIST (LeCun et al., 2010), KMNIST (Clanuwat et al., 2018) and FashionMNIST (Xiao et al., 2017). (A loading snippet for these public datasets follows the table.)
Dataset Splits | No | The paper states that it uses the 'training set' and 'test set' of each dataset, and notes that CelebA-HQ lacks a train/test split. However, it does not specify explicit training/validation/test splits, percentages, or any validation methodology.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions adapting implementations from Song et al. (2020) and the official GitHub repository of Likelihood Regret, but it does not provide specific version numbers for these or other key software components (e.g., Python, PyTorch, CUDA).
Experiment Setup | Yes | For experiments in Table 1, we use Song et al. (2020)'s pretrained checkpoint for CIFAR10, and we train DMs on the training set of the in-domain dataset for all the other datasets. The inpainting reconstruction is repeated 10 times with alternating checkerboard 8×8 masks (Figure 3). LMD by default sets N = 8; an ablation study on different mask choices can be found in Table 2. We randomly sample a subset of size 100 from each dataset, and standardize all images to 256×256. In our experiments, we add noise to step t = 500 in each attempt (where T = 1000). (A sketch of the mask and noising step follows the table.)
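
As referenced in the Pseudocode row, the paper provides Algorithm 1 (Inpaint) and Algorithm 2 (Lift, Map, Detect). The sketch below is only a rough outline of how Algorithm 2 could be organized, under assumed names: `inpaint_fn` stands in for Algorithm 1 together with the trained diffusion model, `distance_fn` for the perceptual reconstruction distance, and the median aggregation is one plausible choice. The authors' actual implementation is in the linked repository.

```python
import numpy as np

def lmd_score(image, masks, inpaint_fn, distance_fn):
    """Rough sketch of Lift, Map, Detect (Algorithm 2) under assumed names.

    image       : the test image to score.
    masks       : a sequence of checkerboard masks, alternated per attempt.
    inpaint_fn  : callable standing in for Algorithm 1 -- it lifts the
                  masked image into the diffusion model's noisy latent
                  space and maps it back by inpainting the masked region.
    distance_fn : perceptual reconstruction distance (e.g., LPIPS).
    """
    distances = []
    for mask in masks:
        reconstruction = inpaint_fn(image, mask)          # lift + map
        distances.append(distance_fn(image, reconstruction))
    # Detect: a large aggregated reconstruction distance suggests the
    # image is out-of-distribution for the in-domain diffusion model.
    return float(np.median(distances))
```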
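
The six benchmarks named in the Open Datasets row are all publicly downloadable; one way to fetch their test splits is through torchvision, which is an assumed dependency here rather than one the paper pins.

```python
from torchvision import datasets

ROOT = "./data"  # hypothetical download directory

# OOD detection is run pairwise within each group below.
cifar10  = datasets.CIFAR10(ROOT, train=False, download=True)
cifar100 = datasets.CIFAR100(ROOT, train=False, download=True)
svhn     = datasets.SVHN(ROOT, split="test", download=True)

mnist    = datasets.MNIST(ROOT, train=False, download=True)
kmnist   = datasets.KMNIST(ROOT, train=False, download=True)
fashion  = datasets.FashionMNIST(ROOT, train=False, download=True)
```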
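
To make the Experiment Setup row concrete, the following sketch builds the two complementary 8×8 checkerboard masks for 256×256 images and adds forward-diffusion noise up to step t = 500 (with T = 1000). The linear beta schedule and all function names are illustrative assumptions, not the authors' exact configuration.

```python
import torch

def checkerboard_masks(image_size=256, grid=8):
    """Two complementary checkerboard masks laid out on a grid x grid board.

    Alternating between them across the repeated inpainting attempts
    ensures every pixel is masked, and therefore reconstructed, in
    some attempt.
    """
    cell = image_size // grid
    ys, xs = torch.meshgrid(torch.arange(grid), torch.arange(grid), indexing="ij")
    board = ((ys + xs) % 2).float()
    board = board.repeat_interleave(cell, dim=0).repeat_interleave(cell, dim=1)
    return board, 1.0 - board

def lift(x0, t=500, T=1000):
    """Forward-diffuse an image batch x0 to step t (the 'lift' step).

    Assumes a linear beta schedule purely for illustration; the
    pretrained checkpoints may use a different schedule.
    """
    betas = torch.linspace(1e-4, 0.02, T)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t - 1]
    noise = torch.randn_like(x0)
    return alpha_bar.sqrt() * x0 + (1.0 - alpha_bar).sqrt() * noise
```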