Robustly overfitting latents for flexible neural image compression

Authors: Yura Perugachi-Diaz, Arwin Gansekoele, Sandjai Bhulai

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show how our method improves the overall compression performance in terms of the R-D trade-off, compared to its predecessors. Additionally, we show how refinement of the latents with our best-performing method improves the compression performance on both the Tecnick and CLIC datasets. Our method is deployed for a pre-trained hyperprior and for a more flexible model. Further, we give a detailed analysis of our proposed methods and show that they are less sensitive to hyperparameter choices.
Researcher Affiliation | Academia | Yura Perugachi-Diaz, Vrije Universiteit Amsterdam, y.m.perugachidiaz@vu.nl; Arwin Gansekoele, Centrum Wiskunde & Informatica, awg@cwi.nl; Sandjai Bhulai, Vrije Universiteit Amsterdam, s.bhulai@vu.nl
Pseudocode | No | The paper describes its algorithms verbally and mathematically (e.g., equations for the linear, cosine, and SSL probabilities) but does not provide structured pseudocode or algorithm blocks.
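Although the paper itself provides no pseudocode, a minimal Python sketch of what the linear variant of such a two-class rounding distribution could look like is given below; the function name and the exact probability form are illustrative assumptions, not the authors' formulation.

```python
import torch

def linear_rounding_probs(y: torch.Tensor) -> torch.Tensor:
    """Illustrative two-class rounding distribution (assumed linear variant).

    For each latent value y, probability mass is split between floor(y) and
    ceil(y) in proportion to the distance from the opposite integer, so a
    value exactly halfway rounds up or down with probability 0.5 each.
    """
    frac = y - torch.floor(y)                    # distance to floor(y), in [0, 1)
    return torch.stack([1.0 - frac, frac], -1)   # [p(floor), p(ceil)]

# Example: y = 2.25 gives p(floor) = 0.75, p(ceil) = 0.25 under this sketch.
print(linear_rounding_probs(torch.tensor([2.25])))
```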
Open Source Code | Yes | The code can be retrieved from: https://github.com/yperugachidiaz/flexible_neural_image_compression.
Open Datasets | Yes | We use two pre-trained hyperprior models to test SGA+; for both models we use the CompressAI package (Bégaint et al., 2020). The first model is similar to the one trained in Yang et al. (2020)... All experiments in this section are run with a more recent hyperprior-based model which is based on the architecture of Cheng et al. (2020)... Further, we show how SSL outperforms baselines in an R-D plot on the Kodak dataset... Additionally, we show how our method generalizes to the Tecnick and CLIC datasets...
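As a rough illustration of how pre-trained hyperprior models can be loaded from the CompressAI package referenced above, a minimal sketch follows; the specific zoo entries, quality index, and metric are assumptions and may not match the authors' exact checkpoints.

```python
import torch
from compressai.zoo import bmshj2018_hyperprior, cheng2020_anchor

# Hyperprior model in the spirit of the first pre-trained model; quality index
# and metric are placeholders, not the paper's exact settings.
hyperprior = bmshj2018_hyperprior(quality=3, metric="mse", pretrained=True).eval()

# More recent model based on the Cheng et al. (2020) architecture.
cheng = cheng2020_anchor(quality=3, metric="mse", pretrained=True).eval()

x = torch.rand(1, 3, 256, 256)      # dummy image batch
with torch.no_grad():
    out = hyperprior(x)             # forward pass returns x_hat and likelihoods
print(out["x_hat"].shape)
```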
Dataset Splits | Yes | During training, each model is evaluated on the Kodak dataset (Kodak).
Hardware Specification | Yes | We perform our experiments on a single NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions 'CompressAI (Bégaint et al., 2020)' and the 'Adam optimizer' but does not specify explicit version numbers for these or other software dependencies.
Experiment Setup | Yes | We run all experiments with temperature schedule τ(t) = min(exp{−ct}, τmax), where c is the temperature rate determining how fast the temperature τ decreases over time, t is the number of train steps for the refinement of the latents, and τmax ∈ (0, 1) determines how soft the latents start the refining procedure. Additionally, we refine the latents for t = 2000 train iterations, unless specified otherwise... The models were trained with λ = {0.0016, 0.0032, 0.0075, 0.015, 0.03, 0.045}. The channel size is set to N = 128 for the models with λ = {0.0016, 0.0032, 0.0075}; refinement of the latents on Kodak with these models takes approximately 21 minutes. For the remaining λs, the channel size is set to N = 192 and the refining procedure takes approximately 35 minutes... Refinement of the latents with pre-trained models... uses the same optimal learning rate of 0.005 for each method. Refinement of the latents with the models of Cheng et al. (2020) uses a 10 times lower learning rate of 0.0005. Following Yang et al. (2020), we use the settings for atanh with τmax = 0.5, and for STE we use the smaller learning rate of 0.0001... For SGA+, we use optimal convergence settings, which are a fixed learning rate of 0.0005 and τmax = 1. Experimentally, we find approximately the best performance for SSL with a = 2.3.
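For concreteness, a minimal sketch of the quoted temperature schedule and a generic latent-refinement loop is given below; the temperature rate c, the loss callable, and the helper names are placeholders, and the SGA+ sampling step itself is abstracted away.

```python
import math
import torch

def temperature(t: int, c: float = 0.001, tau_max: float = 0.5) -> float:
    # tau(t) = min(exp(-c * t), tau_max): starts at tau_max and decays towards 0,
    # so the relaxed rounding becomes progressively harder over the 2000 steps.
    # The value of c here is a placeholder, not a setting taken from the paper.
    return min(math.exp(-c * t), tau_max)

def refine_latents(y, rate_distortion_loss, steps=2000, lr=0.005):
    """Generic refinement loop over latents y (a sketch, not the authors' code)."""
    y = y.clone().requires_grad_(True)
    # lr = 0.005 matches the quoted setting for the pre-trained hyperprior;
    # the Cheng et al. (2020) models and SGA+ use 0.0005 instead.
    opt = torch.optim.Adam([y], lr=lr)
    for t in range(steps):
        tau = temperature(t)
        loss = rate_distortion_loss(y, tau)  # R + lambda * D under relaxed rounding
        opt.zero_grad()
        loss.backward()
        opt.step()
    return y.detach()
```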