Learned Distributed Image Compression with Multi-Scale Patch Matching in Feature Domain

Authors: Yujun Huang, Bin Chen, Shiyu Qin, Jiawei Li, Yaowei Wang, Tao Dai, Shu-Tao Xia

AAAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We provide comprehensive experiments to show that our method achieves the state-of-the-art performance. |
| Researcher Affiliation | Collaboration | 1. Tsinghua Shenzhen International Graduate School, Tsinghua University; 2. Harbin Institute of Technology, Shenzhen; 3. Research Center of Artificial Intelligence, Peng Cheng Laboratory; 4. Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies; 5. HUAWEI Machine Co., Ltd., Dongguan; 6. Shenzhen University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to the source code for the described methodology, such as a repository link or an explicit statement about code release. |
| Open Datasets | Yes | We conduct experiments on two datasets: KITTI Stereo and KITTI General, proposed in (Ayzik and Avidan 2020). KITTI Stereo contains 1578 training pairs and 790 test pairs, which are paired stereo images from the KITTI Stereo 2012 (Geiger, Lenz, and Urtasun 2012) and KITTI Stereo 2015 (Menze and Geiger 2015) datasets. KITTI General has 174936 training pairs and 3609 test pairs. (A data-loading sketch follows the table.) |
| Dataset Splits | No | The paper reports training and test pairs but does not specify a validation split (e.g., percentages, counts, or explicit use of a validation set beyond training and testing). |
| Hardware Specification | Yes | The experiments are conducted on four Intel(R) Xeon(R) E5-2698 v4 CPUs and eight NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper states, "The proposed MSFDPM is implemented with PyTorch (Paszke et al. 2019)", but it does not specify a version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | We used a batch size of 1 and the Adam optimizer (Kingma and Ba 2015) with a 1 × 10⁻⁴ learning rate. Other hyper-parameters are as follows: (i) the number of features, C = 128; (ii) the patch size, B = 16; (iii) the weight for the rate-distortion trade-off, λ ∈ {0.005, 0.01, 0.02, 0.035, 0.05, 0.07, 0.1}; (iv) the weight for the two stages of distortion, α, equal to 0 when training the autoencoder baseline and 1 when training the full model. (A training-step sketch based on these settings follows the table.) |
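
Both datasets consist of image pairs in which one image is to be compressed and the other serves as side information at the decoder. A minimal PyTorch loading sketch is given below; the directory layout, file naming, and preprocessing are assumptions for illustration, not details taken from the paper.

```python
from pathlib import Path

from torch.utils.data import Dataset
from torchvision.io import read_image


class ImagePairDataset(Dataset):
    """Yields (input, side_information) image pairs, e.g. the two views of a
    KITTI stereo pair. The directory layout here is hypothetical."""

    def __init__(self, input_dir: str, side_dir: str):
        self.input_paths = sorted(Path(input_dir).glob("*.png"))
        self.side_paths = sorted(Path(side_dir).glob("*.png"))
        assert len(self.input_paths) == len(self.side_paths), "unpaired data"

    def __len__(self) -> int:
        return len(self.input_paths)

    def __getitem__(self, idx):
        # read_image returns a uint8 CHW tensor; scale to floats in [0, 1].
        x = read_image(str(self.input_paths[idx])).float() / 255.0
        y = read_image(str(self.side_paths[idx])).float() / 255.0
        return x, y
```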
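
The Experiment Setup row maps directly onto a training configuration. The sketch below collects the reported hyper-parameters; since the code is not released, the `model(x, y)` interface returning a bit-rate estimate and the two stage distortions is an assumption, and the loss R + λ(d₁ + α·d₂) is a plausible reading of the setup rather than a formula quoted from the paper.

```python
import torch

# Hyper-parameters as reported in the paper's experiment setup.
CONFIG = {
    "batch_size": 1,
    "learning_rate": 1e-4,  # Adam optimizer
    "num_features": 128,    # C
    "patch_size": 16,       # B
    "lambdas": (0.005, 0.01, 0.02, 0.035, 0.05, 0.07, 0.1),
    "alpha": 1.0,           # 0.0 when training the autoencoder baseline
}


def make_optimizer(model: torch.nn.Module) -> torch.optim.Adam:
    return torch.optim.Adam(model.parameters(), lr=CONFIG["learning_rate"])


def train_step(model, optimizer, x, y, lam, alpha=CONFIG["alpha"]):
    """One rate-distortion training step. Assumes (hypothetically) that
    model(x, y) returns a dict with an estimated bit-rate and the
    distortions of the two decoding stages."""
    out = model(x, y)
    loss = out["rate"] + lam * (
        out["stage1_distortion"] + alpha * out["stage2_distortion"]
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

With α = 0 the second-stage distortion drops out of the objective, matching the reported autoencoder-baseline schedule; the set of λ values suggests one model is trained per operating point to trace the rate-distortion curve, as is standard for learned compression.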