High-Fidelity Generative Image Compression
Authors: Fabian Mentzer, George D. Toderici, Michael Tschannen, Eirikur Agustsson
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We extensively study how to combine Generative Adversarial Networks and learned compression to obtain a state-of-the-art generative lossy compression system. In particular, we investigate normalization layers, generator and discriminator architectures, training strategies, as well as perceptual losses. In contrast to previous work, i) we obtain visually pleasing reconstructions that are perceptually similar to the input, ii) we operate in a broad range of bitrates, and iii) our approach can be applied to high-resolution images. We bridge the gap between rate-distortion-perception theory and practice by evaluating our approach both quantitatively with various perceptual metrics, and with a user study. |
| Researcher Affiliation | Collaboration | Fabian Mentzer (ETH Zürich); George Toderici (Google Research); Michael Tschannen; Eirikur Agustsson (Google Research) |
| Pseudocode | No | The paper describes the architecture and formulation but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | Project page and demo: hific.github.io. This link points to a project page and demo rather than an explicit code repository for the methodology described in the paper. |
| Open Datasets | Yes | We evaluate on three diverse benchmark datasets collected independently of our training set to demonstrate that our method generalizes beyond the training distribution: the widely used Kodak [23] dataset (24 images), as well as the CLIC2020 [46] testset (428 images), and DIV2K [2] validation set (100 images). |
| Dataset Splits | Yes | We evaluate on three diverse benchmark datasets collected independently of our training set to demonstrate that our method generalizes beyond the training distribution: the widely used Kodak [23] dataset (24 images), as well as the CLIC2020 [46] testset (428 images), and DIV2K [2] validation set (100 images). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like "Adam" and architectures like "VGG" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We train all models with Adam [22] for 2,000,000 steps, and initialize our GAN models with weights trained for the rate-distortion objective λr + d only, which speeds up experiments (compared to training GAN models from scratch) and makes them more controllable. Exact values of all training hyper-parameters are tabulated in Appendix A.6. |
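The two-stage schedule quoted above (pre-train for the rate-distortion objective λr + d only, then warm-start the GAN model from those weights) can be illustrated with a toy sketch. This is not the authors' code: the function names, the scalar "weights", and the plain gradient-descent step (standing in for Adam) are all illustrative assumptions.

```python
# Hypothetical sketch of the paper's two-stage training schedule.
# Stage 1 optimizes a rate-distortion-only objective; stage 2
# initializes the GAN model from stage-1 weights and continues
# training with the full objective. Plain gradient descent stands
# in for Adam; scalar "weights" stand in for a real network.

def train(model, grad_fn, steps, lr=0.01):
    """Toy training loop: repeatedly apply a gradient step."""
    for _ in range(steps):
        model["weights"] -= lr * grad_fn(model["weights"])
    return model

def rd_grad(w):
    # Stand-in gradient of a rate-distortion loss (minimum at w = 1.0).
    return w - 1.0

def rd_gan_grad(w):
    # Stand-in gradient of the combined rate-distortion + GAN loss
    # (minimum at w = 2.0).
    return w - 2.0

# Stage 1: rate-distortion-only pre-training from scratch.
base = train({"weights": 0.0}, rd_grad, steps=1000)

# Stage 2: warm-start the GAN model from the stage-1 weights
# (the initialization trick described in the quote), then fine-tune.
gan_model = train({"weights": base["weights"]}, rd_gan_grad, steps=1000)
```

In this toy setting the warm start means stage 2 begins near the rate-distortion optimum rather than at random initialization, mirroring the paper's rationale that it speeds up experiments and makes them more controllable.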