Idempotent Learned Image Compression with Right-Inverse

Authors: Yanghao Li, Tongda Xu, Yan Wang, Jingjing Liu, Ya-Qin Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical results show that we achieve state-of-the-art rate-distortion performance among idempotent codecs. Furthermore, our idempotent codec can be extended into a near-idempotent codec by relaxing the right-invertibility, and this near-idempotent codec has significantly less quality decay after 50 rounds of re-compression compared with other near-idempotent codecs." (Section 4, Experiments; see the re-compression sketch after the table.)
Researcher Affiliation | Academia | Institute for AI Industry Research (AIR), Tsinghua University
Pseudocode | No | The paper contains architectural diagrams and mathematical formulations but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Furthermore, the source code for reproducing experimental results is provided in the supplementary materials."
Open Datasets | Yes | "All the models are trained on the training split of open-images dataset [Kuznetsova et al., 2020], and all the evaluations are conducted on the Kodak dataset [Franzen, 1999]."
Dataset Splits | No | The paper mentions training on the open-images dataset and evaluating on the Kodak dataset, but does not provide explicit train/validation/test splits (e.g., percentages or image counts) or mention a held-out validation set.
Hardware Specification | Yes | "All the experiments are conducted on a computer with an AMD EPYC 7742 64-Core Processor and 8 Nvidia A30 GPUs."
Software Dependencies | Yes | "All the code is implemented based on Python 3.9, PyTorch 1.12 and CompressAI [Bégaint et al., 2020]."
Experiment Setup | Yes | "Images are randomly cropped to 256×256 for training, and a batch size of 16 is used. All the models are trained using an Adam optimizer. The learning rate is initially set to 10⁻⁴ and decays by a factor of 10 when plateaued. We choose four bitrate levels according to the benchmark setting in [Kim et al., 2020]. Specifically, we set λ = {18, 67, 250, 932} × 10⁻⁴, and models trained with these λ reach average bitrates of 0.2–1.5 bpp on the Kodak dataset. Following prior works [Ballé et al., 2017, 2018], we use a smaller number of code channels (192) for the lower-bpp points and a larger number (320) for the higher-bpp points. The learned function f(·) in Eq. 10 is implemented with a residual block." (A training-loop sketch based on this setup follows the table.)
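
As a concrete reading of the quoted Experiment Setup, here is a minimal PyTorch/CompressAI training sketch. The crop size, batch size, optimizer, plateau LR decay, and λ values are taken from the quote; the backbone (mbt2018 from the CompressAI model zoo), quality index, and dataset path are illustrative assumptions, not the authors' actual pipeline, and the entropy bottleneck's auxiliary loss is omitted for brevity.

    import math
    import torch
    import torch.nn.functional as F
    from torch.utils.data import DataLoader
    from torchvision import transforms
    from compressai.datasets import ImageFolder
    from compressai.zoo import mbt2018  # stand-in backbone, not the paper's model

    lmbda = 250e-4  # one of {18, 67, 250, 932} × 10⁻⁴ from the quoted setup

    model = mbt2018(quality=5).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    # Decay the learning rate by 10x when the loss plateaus, as described above.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.1)

    train_transform = transforms.Compose([
        transforms.RandomCrop(256),  # random 256×256 training crops
        transforms.ToTensor(),
    ])
    # "path/to/openimages" is a placeholder for the Open Images training split.
    dataset = ImageFolder("path/to/openimages", transform=train_transform,
                          split="train")
    loader = DataLoader(dataset, batch_size=16, shuffle=True)

    for x in loader:
        out = model(x)
        # Rate term: bits per pixel from the entropy model's likelihoods.
        num_pixels = x.size(0) * x.size(2) * x.size(3)
        bpp = sum(torch.log(lik).sum() for lik in out["likelihoods"].values()) \
            / (-math.log(2) * num_pixels)
        # Distortion term: MSE, scaled as in the CompressAI training examples.
        mse = F.mse_loss(out["x_hat"], x)
        loss = bpp + lmbda * (255 ** 2) * mse
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # After each evaluation pass, scheduler.step(val_loss) applies the 10x decay.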
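
The 50-round re-compression claim quoted in the Research Type row can be probed with a generic loop like the one below. This is not the authors' evaluation script; model stands in for any codec whose forward pass returns a reconstruction under the key "x_hat". An idempotent codec should keep the PSNR curve flat across rounds, while a near-idempotent one should decay only slightly over the 50 rounds.

    import math
    import torch

    @torch.no_grad()
    def recompression_psnr(model, x, rounds=50):
        """PSNR against the original x after each re-compression round."""
        model.eval()
        psnrs = []
        x_hat = x
        for _ in range(rounds):
            x_hat = model(x_hat)["x_hat"].clamp(0.0, 1.0)  # compress + reconstruct
            mse = torch.mean((x_hat - x) ** 2).item()
            psnrs.append(10.0 * math.log10(1.0 / max(mse, 1e-12)))
        return psnrs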