Diversify, Contextualize, and Adapt: Efficient Entropy Modeling for Neural Image Codec
Authors: Jun-Hyuk Kim, Seungeon Kim, Won-Hee Lee, Dokwan Oh
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on popular datasets show that our proposed framework consistently improves rate-distortion performance across various bit-rate regions, e.g., 3.73% BD-rate gain over the state-of-the-art baseline on the Kodak dataset. ... We use a PyTorch [19] based open-source library and evaluation platform, CompressAI [4], which has been widely used for developing and evaluating neural image codecs. Training. We set our model parameters as follows: C = 320, Cl = 10, Cr = 192, and N = 8. We train our models corresponding to six different bit-rates. We use 300,000 images randomly sampled from the Open Images [13] dataset. We construct a batch size of 16 with 256×256 patches randomly cropped from different training images. All models are trained for 100 epochs using the Adam optimizer. The learning rate is set to 10⁻⁴ up to epoch 90, and then decreases to 10⁻⁵. We use PyTorch v1.9.0, CUDA v11.1, cuDNN v8.0.5, and all experiments are conducted using a single NVIDIA A100 GPU. |
| Researcher Affiliation | Industry | Jun-Hyuk Kim Seungeon Kim Won-Hee Lee Dokwan Oh Samsung Advanced Institute of Technology {jh131.kim, se2.kim, why_wh.lee, dokwan.oh}@samsung.com |
| Pseudocode | No | The paper describes the proposed methods and processes in text and figures (e.g., Figure 2 and Figure 3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | Answer: [No] Justification: The code is proprietary. (From NeurIPS Paper Checklist, 'Open access to data and code' section). |
| Open Datasets | Yes | We use 300,000 images randomly sampled from the Open Images [13] dataset. ... [13] I. Krasin, T. Duerig, N. Alldrin, V. Ferrari, S. Abu-El-Haija, A. Kuznetsova, H. Rom, J. Uijlings, S. Popov, A. Veit, S. Belongie, V. Gomes, A. Gupta, C. Sun, G. Chechik, D. Cai, Z. Feng, D. Narayanan, and K. Murphy. Open Images: A public dataset for large-scale multi-label and multi-class image classification. Dataset available from https://github.com/openimages, 2017. |
| Dataset Splits | No | We use 300,000 images randomly sampled from the Open Images [13] dataset. We construct a batch size of 16 with 256×256 patches randomly cropped from different training images. All models are trained for 100 epochs using the Adam optimizer. The learning rate is set to 10⁻⁴ up to epoch 90, and then decreases to 10⁻⁵. We evaluate our method on the two popular datasets: Kodak [7] and Tecnick [1]. The paper mentions training data and evaluation data, but does not explicitly define training/validation/test splits for the training dataset. |
| Hardware Specification | Yes | We use PyTorch v1.9.0, CUDA v11.1, cuDNN v8.0.5, and all experiments are conducted using a single NVIDIA A100 GPU. ... Decoding time is measured on the Kodak dataset using a single NVIDIA V100 GPU. |
| Software Dependencies | Yes | We use PyTorch v1.9.0, CUDA v11.1, cuDNN v8.0.5, and all experiments are conducted using a single NVIDIA A100 GPU. |
| Experiment Setup | Yes | We set our model parameters as follows: C = 320, Cl = 10, Cr = 192, and N = 8. We train our models corresponding to six different bit-rates. We use 300,000 images randomly sampled from the Open Images [13] dataset. We construct a batch size of 16 with 256×256 patches randomly cropped from different training images. All models are trained for 100 epochs using the Adam optimizer. The learning rate is set to 10⁻⁴ up to epoch 90, and then decreases to 10⁻⁵. |
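The training hyperparameters reported above can be summarized as a minimal sketch. This is a hypothetical reconstruction for illustration only (the paper's code is proprietary, and all names below are our own, not from the authors' implementation); it captures the batch size, crop size, epoch budget, and the step learning-rate schedule quoted from the paper.

```python
# Hypothetical reconstruction of the reported training configuration
# (illustrative only; the authors' actual code is proprietary).
TRAIN_CONFIG = {
    "batch_size": 16,          # 16 patches per batch
    "crop_size": (256, 256),   # 256x256 random crops from training images
    "epochs": 100,             # total training epochs
    "optimizer": "Adam",       # as stated in the paper
}

def learning_rate(epoch: int) -> float:
    """Step schedule quoted in the paper: 1e-4 up to epoch 90, then 1e-5."""
    return 1e-4 if epoch < 90 else 1e-5
```

In a PyTorch training loop this schedule would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[90], gamma=0.1)`, which multiplies the initial 10⁻⁴ rate by 0.1 at epoch 90.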