On the Out-of-distribution Generalization of Probabilistic Image Modelling

Authors: Mingtian Zhang, Andi Zhang, Steven McDonagh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply the proposed model to OOD detection tasks and achieve state-of-the-art unsupervised OOD detection performance without the introduction of additional data. Additionally, we employ our model to build a new lossless image compressor: NeLLoC (Neural Local Lossless Compressor) and report state-of-the-art compression rates and model size. We conduct OOD detection experiments using four different dataset-pairs that are considered challenging [36]: FashionMNIST (ID) vs. MNIST (OOD); FashionMNIST (ID) vs. OMNIGLOT (OOD); CIFAR10 (ID) vs. SVHN (OOD); CIFAR10 (ID) vs. CelebA (OOD). (See the evaluation sketch after the table.)
Researcher Affiliation | Collaboration | Mingtian Zhang (1,2), Andi Zhang (2,3), Steven McDonagh (2); 1: AI Center, University College London; 2: Huawei Noah's Ark Lab; 3: Department of Computer Science and Technology, University of Cambridge
Pseudocode | No | The paper describes its methods in prose and does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | We provide practical implementations of NeLLoC with different coders and pre-trained models at https://github.com/zmtomorrow/NeLLoC.
Open Datasets | Yes | We fit the model to FashionMNIST (grayscale) and CIFAR10 (color) training datasets and test using in-distribution (ID) images (respective dataset test images) and additional out-of-distribution (OOD) datasets: MNIST, KMNIST (grayscale) and SVHN, CelebA (color). We train NeLLoC with horizon length h = 3 on two (training) datasets: CIFAR10 (32×32) and ImageNet32 (32×32) and test on the previously introduced test sets, including both ID and OOD data. The paper also lists citations for these datasets, e.g., "[51] Fashion-MNIST", "[28] The MNIST database", "[10] ImageNet: A large-scale hierarchical image database." (See the data-loading sketch after the table.)
Dataset Splits | No | The paper mentions training and test datasets but does not provide specific details regarding a separate validation dataset split (e.g., percentages, sample counts, or explicit use of a validation set for hyperparameter tuning).
Hardware Specification | No | The paper mentions running experiments 'on a CPU' but does not provide specific details on the hardware used, such as GPU models, specific CPU models or generations, or memory amounts.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup | Yes | Both models use the discretized mixture of logistic distributions [41] with 10 mixtures for the predictive distribution and a ResNet architecture [49, 14]. We use a local horizon length h = 3 (kernel size k = 7) for both grayscale and color image data. Our NeLLoC implementation uses the same network backbone as that of our OOD detection experiment (Section 3): a Masked CNN with kernel size k = 2h + 1 (h is the horizon size) in the first layer, followed by several residual blocks with 1×1 convolution; see Appendix A for the network architecture and training details. In practice, we set α = 10^-4 to balance numerical stability and model flexibility. We use K = 10 (mixture components) for all models in the compression task. (See the backbone sketch after the table.)
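
The OOD detection experiments in the Research Type row are evaluated over ID/OOD dataset pairs. Below is a minimal sketch of that dataset-pair evaluation, assuming a hypothetical `score_fn` that assigns a per-image OOD score from a trained probabilistic model; the paper derives its own score from local autoregressive models, so this only illustrates the AUROC bookkeeping, not the authors' scoring rule.

```python
# Minimal sketch: AUROC over an ID/OOD dataset pair.
# `score_fn` is a placeholder (higher score = more likely OOD).
import numpy as np
from sklearn.metrics import roc_auc_score

def ood_auroc(score_fn, id_images, ood_images):
    """AUROC separating in-distribution (label 0) from OOD (label 1) images."""
    id_scores = np.array([score_fn(x) for x in id_images])
    ood_scores = np.array([score_fn(x) for x in ood_images])
    labels = np.concatenate([np.zeros_like(id_scores), np.ones_like(ood_scores)])
    scores = np.concatenate([id_scores, ood_scores])
    return roc_auc_score(labels, scores)

# Dataset pairs from the table above (ID vs. OOD):
# FashionMNIST vs. MNIST, FashionMNIST vs. OMNIGLOT,
# CIFAR10 vs. SVHN, CIFAR10 vs. CelebA.
```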
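The datasets named in the Open Datasets row are standard benchmarks. The sketch below assumes they are fetched with torchvision; the paper does not specify its data pipeline, and ImageNet32 is not bundled with torchvision, so it is omitted here.

```python
# Hedged sketch: fetching the ID/OOD test sets with torchvision.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
root = "./data"  # hypothetical download location

grayscale = {
    "FashionMNIST (ID)": datasets.FashionMNIST(root, train=False, download=True, transform=to_tensor),
    "MNIST (OOD)":       datasets.MNIST(root, train=False, download=True, transform=to_tensor),
    "KMNIST (OOD)":      datasets.KMNIST(root, train=False, download=True, transform=to_tensor),
    "Omniglot (OOD)":    datasets.Omniglot(root, download=True, transform=to_tensor),
}
color = {
    "CIFAR10 (ID)": datasets.CIFAR10(root, train=False, download=True, transform=to_tensor),
    "SVHN (OOD)":   datasets.SVHN(root, split="test", download=True, transform=to_tensor),
    # CelebA is hosted on Google Drive and may need manual download if quotas are hit.
    "CelebA (OOD)": datasets.CelebA(root, split="test", download=True, transform=to_tensor),
}
```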
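The Experiment Setup row pins down the backbone shape: a first-layer masked convolution with kernel size k = 2h + 1 (so h = 3 gives k = 7), followed by residual blocks built from 1×1 convolutions, with a 10-component discretized mixture of logistics as the predictive distribution. A minimal PyTorch sketch under those constraints is shown below; class names, channel widths, depth, and the grayscale output head are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a local autoregressive backbone with horizon h.
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Type-A masked convolution: each pixel sees only pixels above and to its left."""
    def __init__(self, in_ch, out_ch, kernel_size):
        super().__init__(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        mask = torch.ones_like(self.weight)
        c = kernel_size // 2
        mask[:, :, c, c:] = 0   # current pixel and everything to its right
        mask[:, :, c + 1:] = 0  # all rows below
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask
        return super().forward(x)

class ResBlock1x1(nn.Module):
    """Residual block made only of 1x1 convolutions (no receptive-field growth)."""
    def __init__(self, ch):
        super().__init__()
        self.net = nn.Sequential(nn.ReLU(), nn.Conv2d(ch, ch, 1),
                                 nn.ReLU(), nn.Conv2d(ch, ch, 1))

    def forward(self, x):
        return x + self.net(x)

h, K, width = 3, 10, 128                         # horizon, mixture components, hidden width
model = nn.Sequential(
    MaskedConv2d(1, width, kernel_size=2 * h + 1),  # k = 7 for h = 3
    *[ResBlock1x1(width) for _ in range(5)],
    nn.ReLU(),
    nn.Conv2d(width, 3 * K, 1),                  # per-pixel mixture params: logit, mean, log-scale
)
```

Because only the first layer has spatial extent and every later convolution is 1×1, each pixel's predictive distribution depends only on previously generated pixels within the h-pixel local horizon, which is what keeps the model small and local enough to serve as a lossless compressor of the NeLLoC kind described above.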