Learned Bi-Resolution Image Coding using Generalized Octave Convolutions

Authors: Mohammad Akbari, Jie Liang, Jingning Han, Chengjie Tu
Pages: 6592-6599

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that the proposed scheme outperforms all existing learned methods as well as standard codecs such as the next-generation video coding standard VVC (4:2:0) in both PSNR and MS-SSIM. We also show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based schemes such as semantic segmentation and image denoising.
Researcher Affiliation | Collaboration | Mohammad Akbari (1), Jie Liang (1), Jingning Han (2), Chengjie Tu (3); (1) Simon Fraser University, Canada; (2) Google Inc., Mountain View; (3) Tencent Technologies
Pseudocode | No | The paper includes architectural diagrams and mathematical formulations but no pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about making its source code publicly available or link to a code repository.
Open Datasets | Yes | The CLIC training set with images of at least 256 pixels in height or width (1732 images in total) were used for training the proposed model.
Dataset Splits | No | The paper mentions using the CLIC training set and Kodak for testing, but does not specify details for a validation split.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models) used for experiments.
Software Dependencies | No | The paper mentions using the Adam solver but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | All models in our framework were jointly trained for 200 epochs with mini-batch stochastic gradient descent and a batch size of 8. The Adam solver with learning rate of 0.00005 was fixed for the first 100 epochs, and was gradually decreased to zero for the next 100 epochs.
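
The learning-rate schedule quoted under Experiment Setup can be sketched as a small function. This is only an illustration, not the authors' code: the paper says the rate was "gradually decreased to zero" without naming the decay shape, so a linear decay is assumed here, and the names `BASE_LR`, `HOLD_EPOCHS`, `DECAY_EPOCHS`, and `learning_rate` are hypothetical.

```python
BASE_LR = 5e-5       # learning rate quoted from the paper (0.00005)
HOLD_EPOCHS = 100    # epochs during which the rate stays fixed
DECAY_EPOCHS = 100   # epochs over which the rate falls to zero


def learning_rate(epoch: int) -> float:
    """Return the learning rate for a 0-indexed epoch.

    Assumption: the decay phase is linear; the source only says "gradually".
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch < HOLD_EPOCHS:
        return BASE_LR
    # Fraction of the decay phase still remaining, clamped at zero
    # for any epoch at or beyond HOLD_EPOCHS + DECAY_EPOCHS.
    remaining = max(0, HOLD_EPOCHS + DECAY_EPOCHS - epoch) / DECAY_EPOCHS
    return BASE_LR * remaining
```

In a training loop, the returned value could be assigned to the optimizer at the start of each epoch, or the ratio `learning_rate(epoch) / BASE_LR` could be supplied as the multiplier of a lambda-style scheduler. A cosine or stepwise decay would also fit the paper's wording.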