Entroformer: A Transformer-based Entropy Model for Learned Image Compression
Authors: Yichen Qian, Xiuyu Sun, Ming Lin, Zhiyu Tan, Rong Jin
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTAL RESULTS: We evaluate the effects of our transformer-based entropy model by calculating the rate-distortion (RD) performance. Figure 5 shows the RD curves over the publicly available Kodak dataset (Kodak, 1993) using peak signal-to-noise ratio (PSNR) as the image quality metric. As shown in the left part, our Entroformer with the joint hyperprior and context modules outperforms the state-of-the-art CNN methods by 5.2% and BPG by 20.5% at low bit rates. (A minimal PSNR computation sketch is given after this table.) |
| Researcher Affiliation | Industry | Yichen Qian (Alibaba Group, Hangzhou, China, yichen.qyc@alibaba-inc.com); Ming Lin (Alibaba Group, Bellevue, WA 98004, USA, ming.l@alibaba-inc.com); Xiuyu Sun (Alibaba Group, Hangzhou, China, xiuyu.sxy@alibaba-inc.com); Zhiyu Tan (Alibaba Group, Hangzhou, China, zhiyu.tzy@alibaba-inc.com); Rong Jin (Alibaba Group, Bellevue, WA 98004, USA, jinrong.jr@alibaba-inc.com) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/damo-cv/entroformer. |
| Open Datasets | Yes | We choose 14886 images from Open Image (Krasin et al., 2017) as our training data. |
| Dataset Splits | No | The paper specifies training data and a test set (Kodak dataset) but does not explicitly detail the use or splitting of a separate validation set. |
| Hardware Specification | Yes | All models are trained for 300 epochs with a batch size of 16 and a patch size of 384×384 on a 16GB Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2014) with β1 = 0.9, β2 = 0.999, ϵ = 1×10⁻⁸, and a base learning rate of 1×10⁻⁴. When training transformers, it is standard practice to use a warmup phase at the beginning of learning, during which the learning rate increases from zero to its peak value (Vaswani et al., 2017). We use a warmup over the first 5% of the total epochs; the learning rate then decays stepwise by a factor of 0.75 every 1/5 of the total epochs. Gradient clipping, set to 1.0, is also helpful in the compression setup. All models are trained for 300 epochs with a batch size of 16 and a patch size of 384×384 on a 16GB Tesla V100 GPU. (A training-loop sketch of these settings follows the table.) |
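
The training recipe quoted in the Experiment Setup row can be summarized in a short PyTorch sketch. This is a minimal illustration only: the model, data, and loss below are placeholders (the paper's Entroformer architecture and rate-distortion loss are not reproduced here), while the optimizer settings, warmup proportion, stepwise decay, gradient clipping, epoch count, batch size, and patch size follow the quoted description.

```python
import torch
import torch.nn as nn

# Hyperparameters taken from the quoted "Experiment Setup" row; everything
# else (model, data, loss) is a placeholder, not the authors' implementation.
TOTAL_EPOCHS = 300
WARMUP_EPOCHS = int(0.05 * TOTAL_EPOCHS)  # warmup spans 5% of total epochs
DECAY_STEP = TOTAL_EPOCHS // 5            # stepwise decay every 1/5 of the epochs
BASE_LR = 1e-4

model = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # stand-in for the Entroformer

optimizer = torch.optim.Adam(model.parameters(), lr=BASE_LR,
                             betas=(0.9, 0.999), eps=1e-8)

def lr_lambda(epoch: int) -> float:
    """Linear warmup, then decay by a factor of 0.75 every DECAY_STEP epochs."""
    if epoch < WARMUP_EPOCHS:
        return (epoch + 1) / WARMUP_EPOCHS
    return 0.75 ** ((epoch - WARMUP_EPOCHS) // DECAY_STEP)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for epoch in range(TOTAL_EPOCHS):
    batch = torch.randn(16, 3, 384, 384)   # batch size 16, 384x384 crops (random placeholder data)
    loss = model(batch).abs().mean()        # placeholder for the rate-distortion loss
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient clipping at 1.0
    optimizer.step()
    scheduler.step()
```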
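
For reference, the PSNR metric cited in the Research Type row is conventionally computed from the mean squared error between the original and reconstructed images. The sketch below is a generic implementation assuming inputs scaled to [0, 1]; it is not the authors' evaluation code, and the toy images are only for demonstration.

```python
import torch

def psnr(x: torch.Tensor, y: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio in dB for tensors scaled to [0, max_val]."""
    mse = torch.mean((x - y) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)

# Toy usage: compare an image with a slightly perturbed copy.
original = torch.rand(1, 3, 512, 768)  # Kodak images are 512x768
reconstruction = (original + 0.01 * torch.randn_like(original)).clamp(0, 1)
print(f"PSNR: {psnr(original, reconstruction):.2f} dB")
```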