On the Out-of-distribution Generalization of Probabilistic Image Modelling

Authors: Mingtian Zhang, Andi Zhang, Steven McDonagh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply the proposed model to OOD detection tasks and achieve state-of-the-art unsupervised OOD detection performance without the introduction of additional data. Additionally, we employ our model to build a new lossless image compressor: NeLLoC (Neural Local Lossless Compressor) and report state-of-the-art compression rates and model size. We conduct OOD detection experiments using four different dataset-pairs that are considered challenging [36]: FashionMNIST (ID) vs. MNIST (OOD); FashionMNIST (ID) vs. OMNIGLOT (OOD); CIFAR10 (ID) vs. SVHN (OOD); CIFAR10 (ID) vs. CelebA (OOD). (See the evaluation sketch after the table.)
Researcher Affiliation | Collaboration | Mingtian Zhang (1,2), Andi Zhang (2,3), Steven McDonagh (2); 1: AI Center, University College London; 2: Huawei Noah's Ark Lab; 3: Department of Computer Science and Technology, University of Cambridge
Pseudocode | No | The paper describes its methods in prose and does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We provide practical implementations of NeLLoC with different coders and pre-trained models at https://github.com/zmtomorrow/NeLLoC.
Open Datasets | Yes | We fit the model to FashionMNIST (grayscale) and CIFAR10 (color) training datasets and test using in-distribution (ID) images (respective dataset test images) and additional out-of-distribution (OOD) datasets: MNIST, KMNIST (grayscale) and SVHN, CelebA (color). We train NeLLoC with horizon length h = 3 on two (training) datasets: CIFAR10 (32×32) and ImageNet32 (32×32) and test on the previously introduced test sets, including both ID and OOD data. The paper also lists citations for these datasets, e.g., "[51] Fashion-MNIST", "[28] The MNIST database", "[10] ImageNet: A large-scale hierarchical image database." (See the data-loading sketch after the table.)
Dataset Splits | No | The paper mentions training and test datasets but does not provide specific details regarding a separate validation dataset split (e.g., percentages, sample counts, or explicit use of a validation set for hyperparameter tuning).
Hardware Specification | No | The paper mentions running experiments 'on a CPU' but does not provide specific details on the hardware used, such as GPU models, specific CPU models or generations, or memory amounts.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup | Yes | Both models use the discretized mixture of logistic distributions [41] with 10 mixtures for the predictive distribution and a ResNet architecture [49, 14]. We use a local horizon length h = 3 (kernel size k = 7) for both grayscale and color image data. Our NeLLoC implementation uses the same network backbone as that of our OOD detection experiment (Section 3): a Masked CNN with kernel size k = 2h + 1 (h is the horizon size) in the first layer, followed by several residual blocks with 1×1 convolution; see Appendix A for the network architecture and training details. In practice, we set α = 10^-4 to balance numerical stability and model flexibility. We use K = 10 (mixture components) for all models in the compression task. (See the backbone sketch after the table.)
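
The OOD detection experiments in the Research Type row are evaluated over ID/OOD dataset pairs. Below is a minimal sketch of that dataset-pair evaluation, assuming a hypothetical `score_fn` that assigns a per-image OOD score from a trained probabilistic model; the paper derives its own score from local autoregressive models, so this only illustrates the AUROC bookkeeping, not the authors' scoring rule.

```python
# Minimal sketch: AUROC over an ID/OOD dataset pair.
# `score_fn` is a placeholder (higher score = more likely OOD).
import numpy as np
from sklearn.metrics import roc_auc_score

def ood_auroc(score_fn, id_images, ood_images):
    """AUROC separating in-distribution (label 0) from OOD (label 1) images."""
    id_scores = np.array([score_fn(x) for x in id_images])
    ood_scores = np.array([score_fn(x) for x in ood_images])
    labels = np.concatenate([np.zeros_like(id_scores), np.ones_like(ood_scores)])
    scores = np.concatenate([id_scores, ood_scores])
    return roc_auc_score(labels, scores)

# Dataset pairs from the table above (ID vs. OOD):
# FashionMNIST vs. MNIST, FashionMNIST vs. OMNIGLOT,
# CIFAR10 vs. SVHN, CIFAR10 vs. CelebA.
```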
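The datasets named in the Open Datasets row are standard benchmarks. The sketch below assumes they are fetched with torchvision; the paper does not specify its data pipeline, and ImageNet32 is not bundled with torchvision, so it is omitted here.

```python
# Hedged sketch: fetching the ID/OOD test sets with torchvision.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
root = "./data"  # hypothetical download location

grayscale = {
    "FashionMNIST (ID)": datasets.FashionMNIST(root, train=False, download=True, transform=to_tensor),
    "MNIST (OOD)":       datasets.MNIST(root, train=False, download=True, transform=to_tensor),
    "KMNIST (OOD)":      datasets.KMNIST(root, train=False, download=True, transform=to_tensor),
    "Omniglot (OOD)":    datasets.Omniglot(root, download=True, transform=to_tensor),
}
color = {
    "CIFAR10 (ID)": datasets.CIFAR10(root, train=False, download=True, transform=to_tensor),
    "SVHN (OOD)":   datasets.SVHN(root, split="test", download=True, transform=to_tensor),
    # CelebA is hosted on Google Drive and may need manual download if quotas are hit.
    "CelebA (OOD)": datasets.CelebA(root, split="test", download=True, transform=to_tensor),
}
```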
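The Experiment Setup row pins down the backbone shape: a first-layer masked convolution with kernel size k = 2h + 1 (so h = 3 gives k = 7), followed by residual blocks built from 1×1 convolutions, with a 10-component discretized mixture of logistics as the predictive distribution. A minimal PyTorch sketch under those constraints is shown below; class names, channel widths, depth, and the grayscale output head are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a local autoregressive backbone with horizon h.
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Type-A masked convolution: each pixel sees only pixels above and to its left."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        mask = torch.ones_like(self.weight)
        c = kernel_size // 2
        mask[:, :, c, c:] = 0   # current pixel and everything to its right
        mask[:, :, c + 1:] = 0  # all rows below
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask
        return super().forward(x)

class ResBlock1x1(nn.Module):
    """Residual block made only of 1x1 convolutions (no receptive-field growth)."""
    def __init__(self, ch):
        super().__init__()
        self.net = nn.Sequential(nn.ReLU(), nn.Conv2d(ch, ch, 1),
                                 nn.ReLU(), nn.Conv2d(ch, ch, 1))

    def forward(self, x):
        return x + self.net(x)

h, K, width = 3, 10, 128                         # horizon, mixture components, hidden width
model = nn.Sequential(
    MaskedConv2d(1, width, kernel_size=2 * h + 1),  # k = 7 for h = 3
    *[ResBlock1x1(width) for _ in range(5)],
    nn.ReLU(),
    nn.Conv2d(width, 3 * K, 1),                  # per-pixel mixture params: logit, mean, log-scale
)
```

Because only the first layer has spatial extent and every later convolution is 1×1, each pixel's predictive distribution depends only on previously generated pixels within the h-pixel local horizon, which is what keeps the model small and local enough to serve as a lossless compressor of the NeLLoC kind described above.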