Idempotent Learned Image Compression with Right-Inverse
Authors: Yanghao Li, Tongda Xu, Yan Wang, Jingjing Liu, Ya-Qin Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that we achieve state-of-the-art rate-distortion performance among idempotent codecs. Furthermore, our idempotent codec can be extended into a near-idempotent codec by relaxing the right-invertibility. And this near-idempotent codec has significantly less quality decay after 50 rounds of re-compression compared with other near-idempotent codecs. (Section 4, Experiments; see the re-compression sketch after this table.) |
| Researcher Affiliation | Academia | Institute for AI Industry Research (AIR), Tsinghua University |
| Pseudocode | No | The paper contains architectural diagrams and mathematical formulations but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Furthermore, the source code for reproducing experimental results are provided in supplementary materials. |
| Open Datasets | Yes | All the models are trained on the training split of open-images dataset [Kuznetsova et al., 2020], and all the evaluations are conducted on the Kodak dataset [Franzen, 1999]. |
| Dataset Splits | No | The paper mentions training on 'open-images dataset' and evaluation on 'Kodak dataset', but does not provide specific training/validation/test splits (e.g., percentages or counts) or explicitly mention a validation set split. |
| Hardware Specification | Yes | All the experiments are conducted on a computer with an AMD EPYC 7742 64-Core Processor and 8 Nvidia A30 GPUs. |
| Software Dependencies | Yes | All the code is implemented based on Python 3.9, PyTorch 1.12, and CompressAI [Bégaint et al., 2020]. |
| Experiment Setup | Yes | Images are randomly cropped to 256×256 for training, and a batch size of 16 is used. All the models are trained using an Adam optimizer. The learning rate is initially set to 10⁻⁴, and decays by a factor of 10 when plateaued. We choose four bitrate levels according to the benchmark setting in [Kim et al., 2020]. Specifically, we set λ = {18, 67, 250, 932} × 10⁻⁴, and models trained with these λ reach average bitrates from 0.2 to 1.5 bpp on the Kodak dataset. Following prior works [Ballé et al., 2017, 2018], we use a smaller number of code channels (192) for lower-bpp points, and a larger number of code channels (320) for higher-bpp points. The learned function f(·) in Eq. 10 is implemented with a residual block. (A hedged training-setup sketch follows this table.) |
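
For concreteness, below is a minimal training-setup sketch assembled from the details quoted in the Experiment Setup and Software Dependencies rows. It is not the authors' released code: the `MeanScaleHyperprior` backbone (the paper trains its own right-inverse codec), the dataset path, the plateau metric, the 192/320 channel assignment rule, and the initial learning rate of 1e-4 are assumptions; the 256×256 crops, batch size 16, Adam optimizer, λ values, and channel widths come from the quoted text.

```python
# Hedged sketch of the training recipe, assuming Python 3.9, PyTorch 1.12, and CompressAI
# as stated in the paper. The backbone, dataset path, plateau metric, and initial LR are
# assumptions; crop size, batch size, optimizer, lambda values, and widths are quoted.
import math
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from compressai.datasets import ImageFolder          # expects root/train and root/test folders
from compressai.models import MeanScaleHyperprior    # stand-in backbone, not the paper's right-inverse codec

LAMBDAS = [18e-4, 67e-4, 250e-4, 932e-4]             # four bitrate levels reported in the paper
lmbda = LAMBDAS[0]
M = 192 if lmbda <= 67e-4 else 320                   # assumed split: 192 channels for low-bpp, 320 for high-bpp

train_loader = DataLoader(
    ImageFolder("open_images/", split="train",       # hypothetical Open Images folder layout
                transform=transforms.Compose([transforms.RandomCrop(256),   # random 256x256 crops
                                              transforms.ToTensor()])),
    batch_size=16, shuffle=True, num_workers=8,
)

model = MeanScaleHyperprior(N=192, M=M).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)                      # initial LR assumed to be 1e-4
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1)  # decay by 10x on plateau

for epoch in range(1000):
    for x in train_loader:
        x = x.cuda()
        out = model(x)                                # returns {"x_hat": ..., "likelihoods": {...}}
        num_pixels = x.size(0) * x.size(2) * x.size(3)
        bpp = sum(torch.log(lk).sum() / (-math.log(2) * num_pixels)
                  for lk in out["likelihoods"].values())
        mse = torch.nn.functional.mse_loss(out["x_hat"], x)
        loss = bpp + lmbda * 255 ** 2 * mse           # standard R-D objective; the 255^2 scaling is assumed
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # scheduler.step(val_loss)   # plateau detection needs a validation metric the paper does not specify
    # (CompressAI's auxiliary entropy-bottleneck loss/optimizer is omitted from this sketch for brevity.)
```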
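The Research Type row quotes a 50-round re-compression experiment. Below is a hedged sketch of how such an evaluation loop could be measured with a CompressAI-style model whose `forward()` returns `x_hat`; the `recompression_psnr` helper is hypothetical and the paper's actual evaluation script is not reproduced here. For a perfectly idempotent codec the PSNR curve stays flat after the first round, while near-idempotent codecs decay gradually.

```python
# Hedged sketch of a re-compression quality-decay measurement: repeatedly decode and
# re-encode an image and track PSNR against the original after every round.
import torch

@torch.no_grad()
def recompression_psnr(model, x, rounds=50):
    """Re-compress `x` for `rounds` rounds and return the PSNR (dB) after each round."""
    psnrs = []
    cur = x
    for _ in range(rounds):
        cur = model(cur)["x_hat"].clamp(0, 1)         # one encode-decode round
        mse = torch.nn.functional.mse_loss(cur, x)
        psnrs.append(10 * torch.log10(1.0 / mse).item())
    return psnrs
```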