On the relation between statistical learning and perceptual distances

Authors: Alexander Hepburn, Valero Laparra, Raul Santos-Rodriguez, Johannes Ballé, Jesus Malo

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. We demonstrate that good perceptual distances, i.e. distances that are good at predicting human psychophysical responses, are also correlated with image likelihoods obtained using a recent probabilistic image model (PixelCNN++ (Salimans et al., 2017)). This indirectly supports the idea that part of the biology is informed by efficient representation, as conjectured by Barlow. ... Experiment. Here we explore the relation of the previously defined distances to the probability distribution of the training data, using a compression autoencoder in a toy 2D example. ... Experiment. The TID2013 dataset (Ponomarenko et al., 2013) contains 25 reference images, x1, and 3000 distorted images, x2, with 24 distortion types at 5 levels of severity. The mean opinion score (MOS) for each distorted image is also provided.
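The correlation analysis described above (perceptual distances versus image likelihoods, and distances versus MOS) boils down to a rank-correlation computation. A minimal illustrative sketch follows; it is not the authors' code, and the data here are synthetic stand-ins for the model log-likelihoods and distance values:

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks
    (ties ignored, which is fine for continuous-valued inputs)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

rng = np.random.default_rng(1)
# Hypothetical stand-ins: per-image log-likelihoods under an image model,
# and a perceptual distance that (noisily) shrinks as likelihood grows.
log_lik = rng.normal(size=200)
distance = -log_lik + 0.3 * rng.normal(size=200)

rho = spearman(distance, log_lik)
print(rho)  # strongly negative: more likely images, smaller distances
```

The same `spearman` helper applies unchanged when correlating distances against MOS values from TID2013.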
Researcher Affiliation: Collaboration. Alexander Hepburn, Engineering Mathematics, University of Bristol (alex.hepburn@bristol.ac.uk); Valero Laparra, Image Processing Lab, Universitat de Valencia (valero.laparra@uv.es); Raul Santos-Rodriguez, Engineering Mathematics, University of Bristol (enrsr@bristol.ac.uk); Johannes Ballé, Google Research (jballe@google.com); Jesús Malo, Image Processing Lab, Universitat de Valencia (jesus.malo@uv.es).
Pseudocode: No. No explicit pseudocode or algorithm blocks are present in the paper.
Open Source Code: Yes. For all experiments, the networks used to compute Dr and Din are pretrained models from Ballé et al. (2016). Code taken from https://github.com/tensorflow/compression, under the Apache License 2.0.
Open Datasets: Yes. We take images, x, from the CIFAR-10 dataset (Krizhevsky et al., 2009), which consists of small samples that we can analyze according to the selected image model. ... The TID2013 dataset (Ponomarenko et al., 2013) contains 25 reference images. ... For images, we modify the image contrast of the Kodak dataset (Kodak, 1993) in order to create more and less likely samples under the image distribution (low-contrast images are generally more likely (Frazor & Geisler, 2006)). ... Open Images dataset (Krasin et al., 2017).
Dataset Splits: No. No explicit training/validation/test splits are detailed for the main datasets (TID2013, CIFAR-10, Kodak). The term 'validation set' appears only in the caption of Figure 11 for the 2D example: 'BPP is bits per pixel (the rate) and SSE is sum square errors (distortion) evaluated on a validation set.'
Hardware Specification: No. No specific hardware (GPU/CPU models, memory, etc.) used to run the experiments is mentioned in the paper.
Software Dependencies: No. No specific version numbers for TensorFlow, PixelCNN++, or Python/PyTorch are given. The paper references 'a pretrained PixelCNN++ model' and the 'TensorFlow Compression package', but without version numbers.
Experiment Setup: Yes. The Adam optimizer was used, with a learning rate of 0.001, a batch size of 4096, and 500,000 steps, where for each step a batch is sampled from the distribution. The large batch size was to account for the heavy-tailed distribution used. ... A 3-layer multilayer perceptron (MLP) is used for e and d, where the dimensions through the network are [2 → 100 → 100 → 2] for e and the reverse for d.
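To make the reported setup concrete, here is a minimal numpy sketch of the described architecture and hyperparameters. It is an illustration, not the authors' implementation: the 2D data are a Gaussian stand-in for the heavy-tailed distribution, and only the forward pass and the SSE distortion term are shown, not the Adam optimization loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(dims):
    """He-initialised weight/bias pairs for an MLP with the given layer sizes."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def mlp_forward(params, x):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

# Encoder e: [2 -> 100 -> 100 -> 2]; decoder d uses the reverse (same shape here).
e_params = init_mlp([2, 100, 100, 2])
d_params = init_mlp([2, 100, 100, 2])

# Hyperparameters reported in the paper (the Adam loop itself is omitted).
learning_rate = 1e-3
batch_size = 4096
num_steps = 500_000  # a fresh batch is sampled from the distribution each step

x = rng.standard_normal((batch_size, 2))  # Gaussian stand-in for the 2D data
z = mlp_forward(e_params, x)              # latent code from encoder e
x_hat = mlp_forward(d_params, z)          # reconstruction from decoder d
sse = np.sum((x - x_hat) ** 2)            # SSE distortion term
print(x.shape, z.shape, x_hat.shape)
```

In the paper the rate term of the compression objective accompanies this distortion term; only the SSE side is sketched here.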