Causal Context Adjustment Loss for Learned Image Compression

Authors: Minghao Han, Shiyin Jiang, Shengxi Li, Xin Deng, Mai Xu, Ce Zhu, Shuhang Gu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate our proposed compression network on various benchmark datasets, in which our method achieves better rate-distortion trade-offs towards the existing state-of-the-art methods, with more than 20% less compression latency."
Researcher Affiliation | Academia | "¹University of Electronic Science and Technology of China, ²Beihang University"
Pseudocode | No | The paper describes the method in text and uses architectural diagrams, but does not include a formal pseudocode block or an algorithm section.
Open Source Code | No | The abstract states 'The code is available here.' However, the NeurIPS Paper Checklist clarifies: 'Since our codes have not been sorted and filed well... we will release them when the codes are sorted well.'
Open Datasets | Yes | "We follow the previous work [49] and train our models on the Open Images [24] dataset. Open Images Dataset contains 300k images with short edge no less than 256 pixels. For evaluation, three benchmarks, i.e., Kodak image set [22], Tecnick test set [1], and CLIC professional validation dataset [41], are utilized to evaluate the proposed network."
Dataset Splits | No | The paper trains on Open Images and evaluates on Kodak, Tecnick, and the CLIC validation dataset. However, it does not provide explicit percentages or sample counts for training/validation/test splits of the Open Images dataset itself, beyond mentioning random cropping for training.
Hardware Specification | Yes | "Our experiments and evaluations are carried out on Intel Xeon Platinum 8375C and a single Nvidia RTX 4090 graphics card."
Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify any programming languages, libraries, or other software with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "We set the channel of latent representation y as 320 and that of hyperprior z is set as 192. We train our network with Adam optimizer. We randomly crop 256×256 sub-blocks from the Open Images dataset [24] with a batch size of 8. We optimize the network with the initial learning rate 1e-4 for 2M steps and then decrease the learning rate to 1e-5 for another 0.4M steps. For the MSE metric, the multipliers λ before rate loss are {0.3, 0.85, 1.8, 3.5, 7, 15}."
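For concreteness, the quoted training recipe can be sketched in code. Since the authors have not released their implementation, the TinyCodec module and the rate proxy below are hypothetical stand-ins; only the optimizer choice, learning-rate schedule, crop size, batch size, and λ values come from the setup quoted above.

```python
import torch
from torch import nn

# Minimal, runnable sketch of the quoted training recipe. TinyCodec and the
# rate proxy are hypothetical stand-ins for the paper's (unreleased) network;
# only the optimizer, LR schedule, crop size, batch size, and lambda values
# are taken from the paper.

LAMBDAS_MSE = (0.3, 0.85, 1.8, 3.5, 7, 15)  # rate multipliers for the MSE models

class TinyCodec(nn.Module):
    """Stand-in autoencoder (the paper's network uses 320 latent / 192 hyperprior channels)."""
    def __init__(self, latent_ch: int = 320):
        super().__init__()
        self.enc = nn.Conv2d(3, latent_ch, kernel_size=8, stride=8)
        self.dec = nn.ConvTranspose2d(latent_ch, 3, kernel_size=8, stride=8)

    def forward(self, x):
        y = self.enc(x)
        return y, self.dec(y)

def train(lam: float, total_steps: int = 2_400_000, decay_at: int = 2_000_000):
    model = TinyCodec()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # initial LR 1e-4
    for step in range(total_steps):
        if step == decay_at:  # drop to 1e-5 after 2M steps, train 0.4M more
            for group in opt.param_groups:
                group["lr"] = 1e-5
        x = torch.rand(8, 3, 256, 256)   # stands in for random 256x256 Open Images crops, batch size 8
        y, x_hat = model(x)
        rate = y.abs().mean()            # placeholder rate term, not a real entropy model
        dist = nn.functional.mse_loss(x_hat, x)
        loss = lam * rate + dist         # lambda multiplies the rate loss, per the paper
        opt.zero_grad()
        loss.backward()
        opt.step()

if __name__ == "__main__":
    train(LAMBDAS_MSE[0], total_steps=10, decay_at=5)  # smoke test with a tiny schedule
```

In the paper's actual objective the rate term is the estimated bitrate from the learned entropy model rather than the proxy used here, and, as is standard practice, one model per λ traces out the rate-distortion curve.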