Universally Quantized Neural Compression
Authors: Eirikur Agustsson, Lucas Theis
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments with two models: (a) a simple linear model and (b) a more complex model based on the hyperprior architecture proposed by Ballé et al. [6] and extended by Minnen et al. [23]. We evaluate all models on the Kodak [20] dataset by computing the rate-distortion (RD) curve in terms of bits-per-pixel (bpp) versus peak signal-to-noise ratio (PSNR). (See the evaluation sketch below the table.) |
| Researcher Affiliation | Industry | Eirikur Agustsson, Google Research (eirikur@google.com); Lucas Theis, Google Research (theis@google.com) |
| Pseudocode | No | No pseudocode or algorithm blocks are present. |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the release of source code for the described methodology. |
| Open Datasets | Yes | We evaluate all models on the Kodak [20] dataset by computing the rate-distortion (RD) curve in terms of bits-per-pixel (bpp) versus peak signal-to-noise ratio (PSNR). [20] Kodak. Photo CD PCD0992, 1993. URL http://r0k.us/graphics/kodak/. |
| Dataset Splits | No | The paper does not specify explicit training/validation/test dataset splits. It mentions using "256x256 pixel crops extracted from a set of 1M high resolution JPEG images" for training and evaluating on the "Kodak [20] dataset", but no specific split percentages or counts are provided for these datasets. |
| Hardware Specification | Yes | The training time was about 30 hours for the linear models and about 60 hours for the hyperprior models on an Nvidia V100 GPU. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer [19]" but does not specify any software versions for libraries, frameworks, or programming languages (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | We optimized all models for mean squared error (MSE). The Adam optimizer [19] was applied for 2M steps with a batch size of 8 and a learning rate of 10^-4, which is reduced to 10^-5 after 1.6M steps. For the first 5,000 steps only the density models were trained and the learning rates of the encoder and decoder transforms were kept at zero. For the hyperprior models we set λ = 2^i for i ∈ {−6, ..., 1} and decayed it by a factor of 1/10 after 200k steps. For the linear models we use slightly smaller λ = 0.4 * 2^i and reduced it by a factor of 1/2 after 100k steps and again after 200k steps. For soft rounding we linearly annealed the parameter from 1 to 16 over the full 2M steps. (See the schedule sketch below the table.) |
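
For reference, the training schedule quoted in the Experiment Setup row can be summarized as a few per-step hyperparameter functions. This is only a minimal sketch assembled from the quoted sentences; the function and variable names are ours, and no official implementation is available, so nothing here should be read as the authors' code.

```python
# Hypothetical sketch of the per-step hyperparameters quoted in the
# "Experiment Setup" row. Names and structure are assumptions, not taken
# from the (unreleased) original code.

TOTAL_STEPS = 2_000_000
BATCH_SIZE = 8

def learning_rate(step, density_model=False):
    """Adam learning rate: 1e-4, dropped to 1e-5 after 1.6M steps.
    For the first 5,000 steps only the density models are trained,
    i.e. the encoder/decoder transforms get a zero learning rate."""
    if step < 5_000 and not density_model:
        return 0.0
    return 1e-4 if step < 1_600_000 else 1e-5

def lambda_hyperprior(step, i):
    """Rate-distortion weight for the hyperprior models:
    lambda = 2**i with i in {-6, ..., 1}, decayed by 1/10 after 200k steps."""
    lam = 2.0 ** i
    return lam / 10.0 if step >= 200_000 else lam

def lambda_linear(step, i):
    """Rate-distortion weight for the linear models:
    lambda = 0.4 * 2**i, halved after 100k steps and again after 200k steps."""
    lam = 0.4 * 2.0 ** i
    if step >= 100_000:
        lam /= 2.0
    if step >= 200_000:
        lam /= 2.0
    return lam

def soft_rounding_alpha(step):
    """Soft-rounding sharpness, linearly annealed from 1 to 16 over 2M steps."""
    return 1.0 + 15.0 * min(step, TOTAL_STEPS) / TOTAL_STEPS
```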
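
The Kodak evaluation described in the Research Type and Open Datasets rows reduces to a rate-distortion curve of bits-per-pixel versus PSNR. The sketch below uses the standard definitions of both quantities; the paper does not spell out its exact conventions (e.g. the peak value assumed for PSNR), so treat this as illustrative only.

```python
import numpy as np

# Standard definitions of the two evaluation quantities (bpp and PSNR);
# the paper's exact conventions are not specified, so this is a sketch.

def psnr(original, reconstruction, max_value=255.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, max_value]."""
    mse = np.mean((original.astype(np.float64) - reconstruction.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_value ** 2 / mse)

def bits_per_pixel(num_bits, height, width):
    """Rate of a compressed image: total bits of the bitstream per pixel."""
    return num_bits / (height * width)

# Example: one RD point for a 768x512 Kodak image compressed to a
# hypothetical 24,000-byte bitstream (images here are placeholders):
# rate = bits_per_pixel(24_000 * 8, 512, 768)
# quality = psnr(original_image, decoded_image)
```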