Variable Rate Image Compression with Recurrent Neural Networks

Authors: George Toderici, Sean O'Malley, Damien Vincent, Sung Jin Hwang, Michele Covell, Shumeet Baluja, Rahul Sukthankar, David Minnen

ICLR 2016

Reproducibility Variables (result and LLM response for each):

Research Type: Experimental
LLM response: "On a large-scale benchmark of 32×32 thumbnails, our LSTM-based approaches provide better visual quality than (headerless) JPEG, JPEG2000 and WebP, with a storage size that is reduced by 10% or more." (Section 4, Experiments & Analysis)

Researcher Affiliation: Industry
LLM response: "George Toderici, Sean M. O'Malley, Sung Jin Hwang, Damien Vincent {gtoderici, smo, sjhwang, damienv}@google.com; David Minnen, Shumeet Baluja, Michele Covell & Rahul Sukthankar {dminnen, shumeet, covell, sukthankar}@google.com; Google, Mountain View, CA, USA"

Pseudocode: No
LLM response: The paper includes architectural diagrams (Figures 1, 2, and 3) but does not provide formal pseudocode or algorithm blocks.

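Since the paper conveys the method only through figures, the following is a minimal sketch, assuming a fully connected LSTM variant, of the iterative residual coding loop those figures describe: each pass encodes the current residual into a fixed number of bits, decodes a reconstruction from them, and subtracts it before the next pass. The layer sizes, 32-bit code width, and straight-through binarizer here are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the variable-rate residual coding loop (assumptions noted above).
import torch
import torch.nn as nn

class Binarizer(nn.Module):
    """Maps features in [-1, 1] to {-1, 1}; gradients pass straight through."""
    def forward(self, x):
        x = torch.tanh(x)
        return x + (torch.sign(x) - x).detach()

class ResidualCoder(nn.Module):
    def __init__(self, patch_dim=8 * 8 * 3, hidden=512, bits=32):
        super().__init__()
        self.enc = nn.LSTMCell(patch_dim, hidden)
        self.to_bits = nn.Linear(hidden, bits)
        self.binarize = Binarizer()
        self.dec = nn.LSTMCell(bits, hidden)
        self.to_patch = nn.Linear(hidden, patch_dim)

    def forward(self, x, steps=8):
        n = x.size(0)
        h_e = c_e = x.new_zeros(n, self.enc.hidden_size)
        h_d = c_d = x.new_zeros(n, self.dec.hidden_size)
        residual, losses = x, []
        for _ in range(steps):                       # each pass spends `bits` more bits
            h_e, c_e = self.enc(residual, (h_e, c_e))
            code = self.binarize(self.to_bits(h_e))  # binary code for this step
            h_d, c_d = self.dec(code, (h_d, c_d))
            recon = self.to_patch(h_d)               # reconstruction of the residual
            residual = residual - recon              # what remains to be coded
            losses.append(residual.pow(2).mean())    # L2 averaged over pixels
        return torch.stack(losses).mean()            # ... and over time steps
```

Running more passes spends more bits and drives the residual down, which is the mechanism behind the variable-rate property.
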
Open Source Code: No
LLM response: The paper does not provide any information or links regarding the availability of open-source code for the described methodology.

Open Datasets: No
LLM response: "Our 32×32 benchmark dataset contains 216 million random color images collected from the public internet." The paper describes the creation of a custom dataset but does not provide a link, DOI, repository, or citation through which the data can be accessed.

Dataset Splits: No
LLM response: "For training the LSTM models, 90% of the images were used; the remaining 10% were set aside for evaluation." The paper mentions training and evaluation splits but does not specify a distinct validation set or its proportion or count.

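For concreteness, a minimal sketch of the stated 90/10 partition; the shuffle and fixed seed are assumptions, since the paper does not say how images were assigned to each split.

```python
import random

def split_90_10(image_paths, seed=0):
    """Shuffle, then place 90% of the images in train and 10% in eval."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    cut = int(0.9 * len(paths))
    return paths[:cut], paths[cut:]
```
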
Hardware Specification: No
LLM response: The paper does not provide specific details about the hardware used to run the experiments.

Software Dependencies: No
LLM response: The paper mentions the use of the Adam algorithm for training but does not provide version numbers for any software dependencies or libraries.

Experiment Setup: Yes
LLM response: "In order to train the various neural network configurations, we used the Adam algorithm proposed by Kingma & Ba (2014). We experimented with learning rates of {0.1, 0.3, 0.5, 0.8, 1}. The L2 loss was normalized by the number of pixels in the patch and also by the number of total time steps (i.e., number of iterations unrolled) needed to fully encode the patch. ... We experimented with the number of steps needed to encode each patch, varying this from 8 to 16."

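The quoted setup maps directly onto a short training loop. The sketch below reuses the hypothetical ResidualCoder defined above; the batch size, loop length, and synthetic patches are assumptions, the learning rate is one value from the paper's sweep, and the per-pixel, per-step loss normalization is the one described in the quote.

```python
import torch
from torch.optim import Adam

torch.manual_seed(0)
model = ResidualCoder()                  # hypothetical sketch defined above
opt = Adam(model.parameters(), lr=0.1)   # one value from the paper's sweep

for _ in range(100):                     # illustrative loop length (assumption)
    # Stand-in for real 8x8 RGB patches scaled to [-1, 1].
    patches = torch.rand(32, 8 * 8 * 3) * 2 - 1
    # Unroll 16 coding iterations (the paper varied this from 8 to 16);
    # forward() already averages the L2 loss over pixels and over the
    # unrolled time steps, matching the paper's normalization.
    loss = model(patches, steps=16)
    opt.zero_grad()
    loss.backward()
    opt.step()
```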