Causal Context Adjustment Loss for Learned Image Compression
Authors: Minghao Han, Shiyin Jiang, Shengxi Li, Xin Deng, Mai Xu, Ce Zhu, Shuhang Gu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed compression network on various benchmark datasets, on which our method achieves better rate-distortion trade-offs than the existing state-of-the-art methods, with more than 20% lower compression latency. |
| Researcher Affiliation | Academia | 1University of Electronic Science and Technology of China 2Beihang University |
| Pseudocode | No | The paper describes the method in text and uses architectural diagrams but does not include a formal pseudocode block or an algorithm section. |
| Open Source Code | No | The abstract states 'The code is available here.' However, the NeurIPS Paper Checklist clarifies: 'Since our codes have not been sorted and filed well... we will release them when the codes are sorted well.' |
| Open Datasets | Yes | We follow the previous work [49] and train our models on the Open Images [24] dataset. The Open Images dataset contains 300k images with short edge no less than 256 pixels. For evaluation, three benchmarks, i.e., the Kodak image set [22], the Tecnick test set [1], and the CLIC professional validation dataset [41], are utilized to evaluate the proposed network. |
| Dataset Splits | No | The paper trains on Open Images and evaluates on Kodak, Tecnick, and CLIC validation datasets. However, it does not provide explicit percentages or sample counts for training/validation/test splits of the Open Images dataset itself, beyond mentioning random cropping for training. |
| Hardware Specification | Yes | Our experiments and evaluations are carried out on an Intel Xeon Platinum 8375C CPU and a single Nvidia RTX 4090 graphics card. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' but does not specify any programming languages, libraries, or other software with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We set the channel dimension of the latent representation y to 320 and that of the hyperprior z to 192. We train our network with the Adam optimizer. We randomly crop 256×256 sub-blocks from the Open Images dataset [24] with a batch size of 8. We optimize the network with an initial learning rate of 1e-4 for 2M steps and then decrease the learning rate to 1e-5 for another 0.4M steps. For the MSE metric, the multipliers λ before the rate loss are {0.3, 0.85, 1.8, 3.5, 7, 15}. A minimal training-loop sketch based on these reported settings follows the table. |
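
To make the reported hyperparameters concrete, below is a minimal PyTorch training-loop sketch assembled only from the settings quoted in the table. The model class `CausalContextModel` and the dataset object `open_images_crops` are hypothetical placeholders (the paper's code was not released at submission time); the loss form, with λ weighting the rate term, follows the paper's stated setup for the MSE-optimized models.

```python
# Minimal training-loop sketch based on the reported setup.
# CausalContextModel and open_images_crops are hypothetical placeholders;
# only the hyperparameters (channel sizes, batch size, crop size, learning-rate
# schedule, lambda values) come from the paper's quoted experiment setup.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader

model = CausalContextModel(latent_channels=320, hyper_channels=192).cuda()  # hypothetical class
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One model is trained per rate point; lambda multiplies the rate loss (MSE models).
rate_lambdas = [0.3, 0.85, 1.8, 3.5, 7, 15]
lam = rate_lambdas[0]

# Random 256x256 crops from Open Images with a batch size of 8 (hypothetical dataset object).
loader = DataLoader(open_images_crops, batch_size=8, shuffle=True)

total_steps = 2_400_000      # 2M steps at lr 1e-4, then another 0.4M steps at lr 1e-5
lr_decay_step = 2_000_000
step = 0
while step < total_steps:
    for x in loader:
        x = x.cuda()
        out = model(x)                        # assumed to return a reconstruction and a bpp estimate
        distortion = F.mse_loss(out["x_hat"], x)
        loss = lam * out["bpp"] + distortion  # lambda weights the rate term, per the paper's description
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step == lr_decay_step:
            for group in optimizer.param_groups:
                group["lr"] = 1e-5
        if step >= total_steps:
            break
```

Each λ value corresponds to a separately trained model, giving one point on the rate-distortion curve per setting.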