EVC: Towards Real-Time Neural Image Compression with Mask Decay

Authors: Wang Guo-Hua, Jiahao Li, Bin Li, Yan Lu

ICLR 2023

Reproducibility Variable Result LLM Response
Research Type Experimental 4 EXPERIMENTS Training & Testing. The training dataset is the training part of Vimeo-90K (Xue et al., 2019) septuplet dataset. For a fair comparison, all models are trained with 200 epochs. ... Testing datasets contain Kodak (Franzen, 1999), HEVC test sequences (Sharman & Suehring, 2017), Tecnick (Asuni & Giachetti, 2014). BD-Rate (Bjontegaard, 2001) for peak signal-to-noise ratio (PSNR) versus bits-per-pixel (BPP) is our main metric to compare different models.
Researcher Affiliation Collaboration Guo-Hua Wang1, Jiahao Li2, Bin Li2, Yan Lu2; 1State Key Laboratory for Novel Software Technology, Nanjing University; 2Microsoft Research Asia; wangguohua@lamda.nju.edu.cn, {li.jiahao,libin,yanlu}@microsoft.com
Pseudocode No The paper describes algorithms and methods but does not present any formal pseudocode blocks or sections labeled “Algorithm”.
Open Source Code Yes Our code is at https://github.com/microsoft/DCVC.
Open Datasets Yes The training dataset is the training part of Vimeo-90K (Xue et al., 2019) septuplet dataset. ... Testing datasets contain Kodak (Franzen, 1999), HEVC test sequences (Sharman & Suehring, 2017), Tecnick (Asuni & Giachetti, 2014).
Dataset Splits No The paper states: "During training, the raw image is randomly cropped into 256×256 patches. For a fair comparison, all models are trained with 200 epochs." It does not explicitly mention validation splits with percentages or counts, or specific pre-defined validation sets; it only describes a training dataset and testing datasets. While the student is "finetuned for a few epochs" after mask decay, no validation split is specified for this step either.
Hardware Specification Yes We test the latency on two computers. One is equipped with one 2080Ti GPU and two Intel(R) Xeon(R) E5-2630 v3 CPUs. Another is equipped with one A100 GPU and one AMD Epyc 7V13 CPU. ... All models are trained on a computer with 8 V100 GPUs.
Software Dependencies Yes We use CompressAI 1.1.0 to gather evaluation results. We install these two pieces of software according to the official instructions. The default configuration of VTM is used. All results are gathered by running the following command: python -m compressai.utils.bench vtm [path to image folder] -c [path to VTM folder]/cfg/encoder_intra_vtm.cfg -b [path to VTM folder]/bin -j 8 -q 24 26 28 30 32 34 36 38 40 42
Experiment Setup Yes The training dataset is the training part of the Vimeo-90K (Xue et al., 2019) septuplet dataset. During training, the raw image is randomly cropped into 256×256 patches. For a fair comparison, all models are trained with 200 epochs. For our method, it costs 60 epochs for the teacher to transform into the student with mask decay, then the student is finetuned for 140 epochs. AdamW is used as the optimizer with batch size 16. The initial learning rate is 2e-4, and decays by 0.5 at epochs 50, 90, 130, and 170. The default decay rate η for our mask decay is 4e-5.
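The step learning-rate schedule quoted above (initial rate 2e-4, halved at epochs 50, 90, 130, and 170) can be sketched as a small helper. This is a minimal illustration of the schedule as described; the function name and signature are assumptions, not taken from the paper's released code.

```python
# Sketch of the step learning-rate schedule described in the setup:
# initial LR 2e-4, multiplied by 0.5 at epochs 50, 90, 130, 170.
# Names and defaults here are illustrative assumptions.
def step_lr(epoch, base_lr=2e-4, milestones=(50, 90, 130, 170), gamma=0.5):
    """Return the learning rate in effect at a given (0-indexed) epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma  # halve the rate at each milestone already passed
    return lr
```

For example, `step_lr(0)` gives 2e-4 and `step_lr(170)` gives 2e-4 / 16 = 1.25e-5, matching the four halvings over the 200-epoch run.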