Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Transformer-based Transform Coding
Authors: Yinhao Zhu, Yang Yang, Taco Cohen
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate image compression models on 4 datasets: Kodak (Kodak, 1999), CLIC2021 testset (CLIC, 2021), Tecnick testset (Asuni & Giachetti, 2014), and JPEG-AI testset (JPEG-AI, 2020). ... As can be seen from Figure 3, SwinT transform consistently outperforms its convolutional counterpart; the RD-performance of SwinT-Hyperprior is on-par with Conv-ChARM, despite the simpler prior; SwinT-ChARM outperforms VTM-12.1 across a wide PSNR range. |
| Researcher Affiliation | Industry | Yinhao Zhu, Yang Yang, Taco Cohen. Qualcomm AI Research. EMAIL |
| Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 2, Figure 10-13) but no formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Training: All image compression models are trained on the CLIC2020 training set. ... For P-frame compression models, we follow the training setup of SSF. Both Conv-SSF and SwinT-SSF are trained on Vimeo-90k Dataset (Xue et al., 2019)... We evaluate image compression models on 4 datasets: Kodak (Kodak, 1999), CLIC2021 testset (CLIC, 2021), Tecnick testset (Asuni & Giachetti, 2014), and JPEG-AI testset (JPEG-AI, 2020). |
| Dataset Splits | No | The paper specifies training sets (CLIC2020, Vimeo-90k) and evaluation test sets (Kodak, CLIC2021, Tecnick, JPEG-AI, UVG, MCL-JCV) but does not explicitly provide details for a separate validation split with specific percentages or sample counts. |
| Hardware Specification | Yes | The models run with PyTorch 1.9.0 on a workstation with one RTX 2080 Ti GPU. ... evaluated on an Intel Core i9-9940 CPU @ 3.30GHz, averaged over 24 Kodak images. ... same host machine with Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz. |
| Software Dependencies | Yes | The models run with PyTorch 1.9.0 and CUDA toolkit 11.1 on a workstation with one RTX 2080 Ti GPU. |
| Experiment Setup | Yes | Training: All image compression models are trained on the CLIC2020 training set. Conv-Hyperprior and SwinT-Hyperprior are trained with 2M batches. Each batch contains 8 random 256x256 crops from training images. Learning rate starts at $10^{-4}$ and is reduced to $10^{-5}$ at 1.8M step. ... To cover a wide range of rate and distortion, for each solution, we train 5 models with $\beta \in \{0.003, 0.001, 0.0003, 0.0001, 0.00003\}$. ... For P-frame compression models, ... trained on Vimeo-90k Dataset (Xue et al., 2019) for 1M steps with learning rate $10^{-4}$, batch size of 8, crop size of 256x256, followed by 50K steps of training with learning rate $10^{-5}$ and crop size 384x256. |
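
The two training schedules quoted in the Experiment Setup row can be written down as piecewise-constant stages. The following is a minimal sketch for readers who want to re-create the configuration, not the authors' code (no code release is reported). The names `Stage`, `IMAGE_STAGES`, `PFRAME_STAGES`, and `stage_at` are hypothetical, and two details are assumptions because the excerpt does not state them: the lower learning rate is taken to apply from step 1.8M inclusive, and crop sizes are stored in the order quoted without deciding which number is the width.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Stage:
    """One constant-learning-rate phase of training (hypothetical helper)."""
    num_steps: int
    learning_rate: float
    crop_size: Tuple[int, int]  # stored in the order quoted in the table


# Image models: 2M batches in total, learning rate 1e-4 reduced to 1e-5
# at step 1.8M, 256x256 crops throughout.
IMAGE_STAGES = (
    Stage(1_800_000, 1e-4, (256, 256)),
    Stage(200_000, 1e-5, (256, 256)),
)

# P-frame models: 1M steps at 1e-4 with 256x256 crops, followed by
# 50K steps at 1e-5 with 384x256 crops.
PFRAME_STAGES = (
    Stage(1_000_000, 1e-4, (256, 256)),
    Stage(50_000, 1e-5, (384, 256)),
)

BATCH_SIZE = 8  # quoted for both the image and the P-frame setups

# One model is trained per beta value, giving 5 rate-distortion points.
BETAS = (0.003, 0.001, 0.0003, 0.0001, 0.00003)


def stage_at(step: int, stages: Sequence[Stage]) -> Stage:
    """Return the stage that a zero-based training step falls into.

    Assumption: stage boundaries are inclusive on the later stage, so the
    first step of the reduced learning rate is exactly the boundary step.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    boundary = 0
    for stage in stages:
        boundary += stage.num_steps
        if step < boundary:
            return stage
    raise ValueError("step lies beyond the end of the schedule")
```

For example, `stage_at(1_800_000, IMAGE_STAGES).learning_rate` gives the reduced rate under the inclusive-boundary assumption, and `stage_at(1_000_000, PFRAME_STAGES).crop_size` gives the larger crop used in the final 50K P-frame steps.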