HiLLoC: lossless image compression with hierarchical latent variable models
Authors: James Townsend, Thomas Bird, Julius Kunze, David Barber
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments (Section 4), we demonstrate that HiLLoC can be used to compress color images from the ImageNet test set at rates close to the ELBO, outperforming all of the other codecs which we benchmark. |
| Researcher Affiliation | Academia | James Townsend, Thomas Bird, Julius Kunze & David Barber, Department of Computer Science, University College London. <firstname>.<surname>@cs.ucl.ac.uk |
| Pseudocode | No | The paper contains tables (Table 1, Table 3) that describe encoding and decoding operations step by step. However, these are not formatted as pseudocode or algorithm blocks with standard pseudocode syntax. |
| Open Source Code | Yes | We release Craystack, an open source library for convenient prototyping of lossless compression using probabilistic models, along with full implementations of all of our compression results. Available at https://github.com/hilloc-submission/hilloc. |
| Open Datasets | Yes | We trained the RVAE on the ImageNet32 training set, then evaluated the RVAE ELBO and HiLLoC compression rate on the ImageNet32 test set. To test generalization, we also evaluated the ELBO and compression rate on the test sets of ImageNet64, CIFAR10 and full size ImageNet. |
| Dataset Splits | Yes | We trained the RVAE on the ImageNet32 training set, then evaluated the RVAE ELBO and HiLLoC compression rate on the ImageNet32 test set. ... All HiLLoC results are obtained from the same model, trained on ImageNet32. ... The ELBO and compression rate of HiLLoC with PixelVAE, trained to convergence on ImageNet64, compared to other schemes. All schemes are evaluated on the ImageNet64 validation set, and measured in bits per pixel-channel. |
| Hardware Specification | Yes | We find that the run times for encoding and decoding are roughly linear in the number of pixels, and the time to compress an average sized ImageNet image of 500 × 374 pixels (with vectorized ANS) is around 29s on a desktop computer with 6 CPU cores and a GTX 1060 GPU. |
| Software Dependencies | No | The paper mentions software such as Craystack, Python, and NumPy, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Using Craystack, we implement HiLLoC with a ResNet VAE (RVAE) (Kingma et al., 2016). This powerful hierarchical latent variable model achieves ELBOs comparable to state of the art autoregressive models. In all experiments we used an RVAE with 24 stochastic hidden layers. ... Since we retained the default hyperparameters from the original implementation, each latent layer has 32 feature maps and spatial dimensions half those of the input (e.g. h/2 × w/2 for input of shape h × w). |