Transformer-based Transform Coding

Authors: Yinhao Zhu, Yang Yang, Taco Cohen

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate image compression models on 4 datasets: Kodak (Kodak, 1999), CLIC2021 testset (CLIC, 2021), Tecnick testset (Asuni & Giachetti, 2014), and JPEG-AI testset (JPEG-AI, 2020). ... As can be seen from Figure 3, SwinT transform consistently outperforms its convolutional counterpart; the RD-performance of SwinT-Hyperprior is on-par with Conv-ChARM, despite the simpler prior; SwinT-ChARM outperforms VTM-12.1 across a wide PSNR range.
Researcher Affiliation | Industry | Yinhao Zhu, Yang Yang, Taco Cohen; Qualcomm AI Research; {yinhaoz, yyangy, tacos}@qti.qualcomm.com
Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 2 and Figures 10-13) but no formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | Training: All image compression models are trained on the CLIC2020 training set. ... For P-frame compression models, we follow the training setup of SSF. Both Conv-SSF and SwinT-SSF are trained on Vimeo-90k Dataset (Xue et al., 2019)... We evaluate image compression models on 4 datasets: Kodak (Kodak, 1999), CLIC2021 testset (CLIC, 2021), Tecnick testset (Asuni & Giachetti, 2014), and JPEG-AI testset (JPEG-AI, 2020).
Dataset Splits | No | The paper specifies training sets (CLIC2020, Vimeo-90k) and evaluation test sets (Kodak, CLIC2021, Tecnick, JPEG-AI, UVG, MCL-JCV) but does not explicitly provide details for a separate validation split with specific percentages or sample counts.
Hardware Specification | Yes | The models run with PyTorch 1.9.0 on a workstation with one RTX 2080 Ti GPU. ... evaluated on an Intel Core i9-9940 CPU @ 3.30GHz, averaged over 24 Kodak images. ... same host machine with Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz.
Software Dependencies | Yes | The models run on a workstation with one RTX 2080 Ti GPU, with PyTorch 1.9.0 and CUDA toolkit 11.1.
Experiment Setup | Yes | Training: All image compression models are trained on the CLIC2020 training set. Conv-Hyperprior and SwinT-Hyperprior are trained with 2M batches. Each batch contains 8 random 256x256 crops from training images. Learning rate starts at 10^-4 and is reduced to 10^-5 at 1.8M steps. ... To cover a wide range of rate and distortion, for each solution, we train 5 models with β ∈ {0.003, 0.001, 0.0003, 0.0001, 0.00003}. ... For P-frame compression models, ... trained on Vimeo-90k Dataset (Xue et al., 2019) for 1M steps with learning rate 10^-4, batch size of 8, crop size of 256x256, followed by 50K steps of training with learning rate 10^-5 and crop size 384x256.
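
The Experiment Setup row quotes a concrete training recipe: 2M steps, batches of 8 random 256x256 crops from the CLIC2020 training set, a learning rate dropped from 10^-4 to 10^-5 at 1.8M steps, and five models per architecture trained over β ∈ {0.003, 0.001, 0.0003, 0.0001, 0.00003}. The PyTorch sketch below restates that schedule in code so the numbers are easy to check; it is not the authors' released implementation. The dataset path, the ImageDirDataset helper, the Adam optimizer, the model's (rate, distortion) output interface, and the placement of β in the rate-distortion loss are all assumptions made only for illustration.

    from pathlib import Path

    import torch
    from PIL import Image
    from torch.utils.data import DataLoader, Dataset
    from torchvision import transforms

    # Five rate points, as quoted in the Experiment Setup row.
    BETAS = [0.003, 0.001, 0.0003, 0.0001, 0.00003]


    class ImageDirDataset(Dataset):
        """Flat directory of training images (a local CLIC2020 copy is assumed;
        the directory layout and file pattern are not specified in the paper)."""

        def __init__(self, root, crop=256):
            self.paths = sorted(Path(root).glob("*.png"))
            self.tfm = transforms.Compose([
                transforms.RandomCrop(crop, pad_if_needed=True),
                transforms.ToTensor(),
            ])

        def __len__(self):
            return len(self.paths)

        def __getitem__(self, i):
            return self.tfm(Image.open(self.paths[i]).convert("RGB"))


    def train_one_model(model, data_root, beta,
                        total_steps=2_000_000, lr_drop_step=1_800_000):
        # "Each batch contains 8 random 256x256 crops from training images."
        loader = DataLoader(ImageDirDataset(data_root), batch_size=8,
                            shuffle=True, num_workers=4, drop_last=True)
        # "Learning rate starts at 10^-4 and is reduced to 10^-5 at 1.8M steps."
        # Adam is an assumption; the quoted text does not name the optimizer.
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        step = 0
        while step < total_steps:
            for x in loader:
                if step == lr_drop_step:
                    for group in opt.param_groups:
                        group["lr"] = 1e-5
                # Assumed model interface: a forward pass returning an estimated
                # rate and a distortion term for the batch.
                rate, distortion = model(x)
                # Assumed Lagrangian convention; which term beta weights is not
                # stated in the quoted text.
                loss = distortion + beta * rate
                opt.zero_grad()
                loss.backward()
                opt.step()
                step += 1
                if step >= total_steps:
                    break
        return model


    # "... for each solution, we train 5 models":
    # for beta in BETAS:
    #     train_one_model(build_swint_hyperprior(), "clic2020/train", beta)
    # (build_swint_hyperprior() is a hypothetical constructor, not a real API.)

On the software side, the stack quoted in the Software Dependencies row (PyTorch 1.9.0 with CUDA toolkit 11.1) can be confirmed at runtime from torch.__version__ and torch.version.cuda.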