IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression
Authors: Rianne van den Berg, Alexey A. Gritsenko, Mostafa Dehghani, Casper Kaae Sønderby, Tim Salimans
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments we found that neither stochastic rounding nor replacing the identity function in the straight-through estimator with a soft approximation of the rounding function improved the results. We compare continuous flow models that are trained using the unbiased gradient ∇_θ L with discrete flow models that are trained using the straight-through gradient estimator g_st. Table 1: Compression results in bits per dimension (bpd) for IDF++, hand-designed codecs and other deep density estimators based on normalizing flows, super-resolution and variational auto-encoders. (A minimal sketch of straight-through rounding appears after the table.) |
| Researcher Affiliation | Industry | Google Research {riannevdberg,agritsenko,dehghani,casperkaae,salimans}@google.com |
| Pseudocode | No | The paper describes algorithmic steps in text but does not contain formally labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an explicit statement about releasing source code for the described methodology or provide a repository link. |
| Open Datasets | Yes | The training set of CIFAR-10 consists of 50000 images and the test set contains 10000 images. ImageNet-32 and ImageNet-64 contain approximately 1250000 train images and 50000 test images. Table 1: Compression results in bits per dimension (bpd) for IDF++, hand-designed codecs and other deep density estimators based on normalizing flows, super-resolution and variational auto-encoders. Where available, the bpd according to the model's negative log-likelihood is indicated in parentheses. Some results are taken from Townsend et al. (2019a) and others from Hoogeboom et al. (2019a), as marked in the table. (A sketch of the nats-to-bpd conversion appears after the table.) |
| Dataset Splits | Yes | Figure 3 shows the performance of models with the proposed modifications (DenseNet++) on the validation set (consisting of 20% of the training set) on CIFAR-10 as a function of flows per level, after 300K iterations. To make a fair comparison against other methods like local bits-back coding (LBB) by Ho et al. (2019b), we train our final models on the entire training set without holding out part of the training set as a validation set. The training set of CIFAR-10 consists of 50000 images and the test set contains 10000 images. (A hold-out split sketch appears after the table.) |
| Hardware Specification | Yes | All experiments were run with 8 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2015)' but does not specify a version number, nor does it provide version numbers for other software dependencies. |
| Experiment Setup | Yes | The model is trained with the Adamax optimizer (Kingma & Ba, 2014) using an exponential learning rate schedule with a base learning rate of 1 × 10⁻³ and a linear warmup phase of 10 epochs. See Table 2 for more details on the learning rate decay, the number of levels, the batch size and the number of epochs used for training. (A schedule sketch appears after the table.) |
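For readers unfamiliar with the straight-through estimator quoted in the Research Type row: below is a minimal sketch of straight-through rounding, assuming the standard formulation (round in the forward pass, identity in the backward pass). The function name `round_ste` is ours for illustration, not from the paper.

```python
import tensorflow as tf

def round_ste(x):
    """Round x in the forward pass; pass gradients through unchanged.

    The stop_gradient trick makes the output numerically equal to
    tf.round(x), while d(output)/dx is the identity -- the
    straight-through gradient estimator g_st.
    """
    return x + tf.stop_gradient(tf.round(x) - x)

# Example: gradients flow through the rounding as if it were identity.
x = tf.Variable([0.3, 1.7, -0.6])
with tf.GradientTape() as tape:
    y = tf.reduce_sum(round_ste(x) ** 2)
print(tape.gradient(y, x))  # 2 * round(x) = [0., 4., -2.], not zero
```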
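Table 1's metric, bits per dimension, is the model's negative log-likelihood rescaled per pixel channel. A small illustrative helper (names and the example NLL value are ours) for converting a per-image NLL in nats to bpd:

```python
import numpy as np

def bits_per_dimension(nll_nats, num_dims):
    """Convert a per-image negative log-likelihood (in nats) to bpd."""
    return nll_nats / (num_dims * np.log(2.0))

# A 32x32 RGB CIFAR-10 image has 32 * 32 * 3 = 3072 dimensions, so an
# (arbitrary, illustrative) NLL of 6900 nats is roughly 3.24 bpd.
print(bits_per_dimension(6900.0, 32 * 32 * 3))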
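The Dataset Splits row cites a validation set of 20% of the CIFAR-10 training set, but the quoted text does not specify how the hold-out is drawn. A sketch assuming a simple random split, with all names ours:

```python
import numpy as np

def train_val_split(num_train, val_fraction=0.2, seed=0):
    """Hold out a random fraction of training indices as a validation set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_train)
    n_val = int(val_fraction * num_train)
    return idx[n_val:], idx[:n_val]  # train indices, validation indices

train_idx, val_idx = train_val_split(50000)  # CIFAR-10: 40000 / 10000
```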
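The Experiment Setup row describes Adamax with a linear warmup followed by exponential learning rate decay; the paper defers the actual decay constants, batch sizes, and epoch counts to its Table 2, so in the sketch below only the 1e-3 base rate and 10-epoch warmup come from the quoted text, and the remaining numbers are placeholders.

```python
import tensorflow as tf

class WarmupExpDecay(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Linear warmup to base_lr, then per-epoch exponential decay.

    base_lr = 1e-3 and the 10-epoch warmup follow the paper; the
    steps-per-epoch and decay rate here are assumed placeholders.
    """

    def __init__(self, base_lr=1e-3, steps_per_epoch=500,
                 warmup_epochs=10, decay_rate=0.99):
        self.base_lr = base_lr
        self.steps_per_epoch = float(steps_per_epoch)
        self.warmup_steps = float(steps_per_epoch * warmup_epochs)
        self.decay_rate = decay_rate

    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        warmup_lr = self.base_lr * step / self.warmup_steps
        epochs_after = (step - self.warmup_steps) / self.steps_per_epoch
        decayed_lr = self.base_lr * self.decay_rate ** tf.maximum(epochs_after, 0.0)
        return tf.where(step < self.warmup_steps, warmup_lr, decayed_lr)

optimizer = tf.keras.optimizers.Adamax(learning_rate=WarmupExpDecay())
```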