Practical lossless compression with latent variables using bits back coding

Authors: James Townsend, Thomas Bird, David Barber

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate this scheme by using it to compress the MNIST dataset with a variational autoencoder model (VAE), achieving compression rates superior to standard methods with only a simple VAE. Given that the scheme is highly amenable to parallelization, we conclude that with a sufficiently high quality generative model this scheme could be used to achieve substantial improvements in compression rate with acceptable running time. We make our implementation available open source at https://github.com/bits-back/bits-back.
Researcher Affiliation | Academia | James Townsend, Thomas Bird & David Barber, Department of Computer Science, University College London. {james.townsend,thomas.bird,david.barber}@cs.ucl.ac.uk
Pseudocode | Yes | Figure 5 shows code implementing the BB-ANS encoding (append) and decoding (pop) methods. Since the message is stack-like, we use the Pythonic names append and pop for encoding and decoding respectively. (A hedged sketch of this append/pop logic is given after this table.)
Open Source Code | Yes | We make our implementation available open source at https://github.com/bits-back/bits-back.
Open Datasets | Yes | We consider the task of compressing the MNIST dataset (Le Cun et al., 1998).
Dataset Splits | No | No, the paper only mentions training and testing sets, not an explicit validation set split or its details. It states: "We first train a VAE on the training set and then compress the test set using BB-ANS with the trained VAE."
Hardware Specification | No | No, it only mentions "CPU" without any specific details such as model, brand, or core count. The text states: "Our current implementation of BB-ANS is written in pure Python, is not parallelized and executes entirely on CPU."
Software Dependencies | No | No, it mentions "Python" but no specific version and no other software dependencies with versions. The text states: "A simple Python implementation of both the encoder and decoder of BB-ANS is given in Appendix C."
Experiment Setup | Yes | For binarized MNIST the generative and recognition networks each have a single deterministic hidden layer of dimension 100, with a stochastic latent of dimension 40. For the full (non-binarized) MNIST dataset each network has one deterministic hidden layer of dimension 200 with a stochastic latent of dimension 50. The output distributions on pixels are modelled by a beta-binomial distribution, which is a two-parameter discrete distribution; the generative network outputs the two beta-binomial parameters for each pixel. Instead of directly sampling the first latents at random, to simplify our implementation we initialize the BB-ANS chain with a supply of clean bits; around 400 bits are required for this in our experiments. (A sketch of a VAE with these dimensions is given after this table.)
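
To make the stack-like append/pop behaviour quoted under Pseudocode concrete, here is a minimal sketch of the three BB-ANS steps described in the paper. The codec objects `prior`, `likelihood(z)` and `posterior(x)`, and their `append`/`pop` methods over an ANS message, are hypothetical stand-ins for illustration and are not the API of the authors' repository.

```python
# Minimal sketch of BB-ANS encoding (append) and decoding (pop),
# assuming hypothetical codec objects for p(z), p(x|z) and q(z|x)
# that expose stack-like append/pop methods on an ANS message.

def bb_ans_append(message, x, prior, likelihood, posterior):
    """Push one symbol x onto the stack-like message."""
    # 1. Pop (decode) a latent z from the message under q(z|x);
    #    these are the bits that are later "gotten back".
    message, z = posterior(x).pop(message)
    # 2. Append (encode) x under the likelihood p(x|z).
    message = likelihood(z).append(message, x)
    # 3. Append (encode) z under the prior p(z).
    message = prior.append(message, z)
    return message


def bb_ans_pop(message, prior, likelihood, posterior):
    """Pop one symbol off the message, inverting bb_ans_append."""
    message, z = prior.pop(message)            # undo step 3
    message, x = likelihood(z).pop(message)    # undo step 2
    message = posterior(x).append(message, z)  # undo step 1
    return message, x
```

Because each pop exactly inverts the corresponding append, decoding recovers the data losslessly and returns the message to its earlier state, which is what allows symbols to be chained on one message stack.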
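
As a companion to the Experiment Setup row, here is a minimal PyTorch sketch of a VAE with the binarized-MNIST dimensions quoted there (one deterministic hidden layer of 100 units, a 40-dimensional stochastic latent). The use of PyTorch, the class and layer names, the tanh activations and the Bernoulli output are assumptions for illustration only; the full-MNIST model described in the paper would instead use a 200-unit hidden layer, a 50-dimensional latent and a beta-binomial output with two parameters per pixel.

```python
# Sketch of a small VAE matching the binarized-MNIST dimensions quoted
# above (hidden 100, latent 40). Architecture details beyond the layer
# sizes are assumptions, not taken from the authors' repository.
import torch
import torch.nn as nn

class BinarizedMnistVAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=100, z_dim=40):
        super().__init__()
        # Recognition (inference) network: one deterministic hidden layer,
        # outputting mean and log-variance of a diagonal Gaussian q(z|x).
        self.enc_hidden = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Generative network: one deterministic hidden layer, outputting
        # Bernoulli logits for each pixel of the binarized image.
        self.dec = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.Tanh(),
            nn.Linear(h_dim, x_dim))

    def encode(self, x):
        h = self.enc_hidden(x)
        return self.enc_mu(h), self.enc_logvar(h)

    def decode(self, z):
        return self.dec(z)  # Bernoulli logits over pixels

    def forward(self, x):
        mu, logvar = self.encode(x)
        # Reparameterization trick: sample z ~ q(z|x).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decode(z), mu, logvar

# Example usage (batch of 8 flattened 28x28 images):
# recon_logits, mu, logvar = BinarizedMnistVAE()(torch.rand(8, 784))
```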