Finite-State Autoregressive Entropy Coding for Efficient Learned Lossless Compression
Authors: Yufeng Zhang, Hang Yu, Jianguo Li, Weiyao Lin
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed lossless compression method could improve the compression ratio by up to 6% compared to the baseline, with negligible extra computational time. and In our experiments, we focused on compressing and decompressing image datasets, specifically CIFAR10 (CF10) (Krizhevsky, 2009) and ImageNet32/64 (IN32, IN64) (Deng et al., 2009). |
| Researcher Affiliation | Collaboration | Yufeng Zhang (1,2), Hang Yu (2), Jianguo Li (2), Weiyao Lin (1); (1) Shanghai Jiao Tong University, (2) Ant Group |
| Pseudocode | Yes | Algorithm 1 Finite-State Autoregressive tANS Coding Algorithm. The steps that are different from tANS are highlighted in red. (A generic ANS coding sketch follows the table.) |
| Open Source Code | Yes | Code is available at https://github.com/alipay/Finite_State_Autoregressive_Entropy_Coding. |
| Open Datasets | Yes | In our experiments, we focused on compressing and decompressing image datasets, specifically CIFAR10 (CF10) (Krizhevsky, 2009) and ImageNet32/64 (IN32, IN64) (Deng et al., 2009). and We calculate the BPD based on the length of the compression bitstream to evaluate practical compression performance. For a qualitative understanding of the results, we present the BPD versus Speed comparison for different methods on CIFAR10 in Figure 1. For complete numerical results, please refer to Appendix E.2. (A BPD computation sketch follows the table.) |
| Dataset Splits | No | The paper mentions the datasets used (CIFAR10, ImageNet32/64) and training epochs/batch sizes, but it does not specify the explicit training, validation, and test dataset splits (e.g., percentages or sample counts) used for their experiments. |
| Hardware Specification | Yes | For CPU-based experiments, we utilize a desktop machine equipped with an Intel i7-6800K CPU. For GPU-based model training, we employ a virtual machine on a Kubermaker cluster featuring 8 NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions various libraries and tools used (e.g., zstd, craystack, Pillow, imageio-flif) in footnotes but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | Regarding FSAR, each variable relies on its previous neighboring variables in the spatial domain. Specifically, the latent prior utilizing the Order-1 Markov model is defined as p(y_{i,j,k} | y_{i,j,k-1}), while Order-2 is p(y_{i,j,k} | y_{i,j-1,k}, y_{i,j,k-1}), and Order-3 is p(y_{i,j,k} | y_{i,j-1,k-1}, y_{i,j-1,k}, y_{i,j,k-1}), where i, j, k represent the indices of the channel, height, and width dimensions, respectively. During training, the Markov models are implemented using a 3-layer network consisting of 3 Linear layers and 2 ReLU layers. For the learnable state number, the initial number of states is set to 256, and α = 1.5 is used in the α-entmax method. For STHQ, the temperature for the Gumbel-softmax GS_τ is set as a constant 0.5. and The training process utilized the Adam optimizer with a learning rate of 10^-3. For the CIFAR10 dataset, the training was performed for 1000 epochs, with a batch size of 64 per GPU. Regarding the ImageNet32 dataset, the training was conducted for 50 epochs, with a batch size of 64 per GPU. As for ImageNet64, the training setup was identical to ImageNet32, except that the batch size was set to 16 per GPU. (A sketch of the Order-1 prior and STHQ follows the table.) |
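The Pseudocode row above points to the paper's Algorithm 1, the FSAR-tANS coder, which is not reproduced in this report. As background only, the sketch below implements a plain rANS coder, a related but simpler member of the ANS family than the table-based tANS that FSAR extends; the 12-bit precision, 16-bit renormalization words, and the toy `freqs`/`cumfreqs` tables are illustrative assumptions, not values from the paper.

```python
# Minimal rANS entropy coder (illustrative background; NOT the paper's FSAR-tANS).
# Assumes integer symbol frequencies that sum to 2**PRECISION.
PRECISION = 12
MASK = (1 << PRECISION) - 1
LOW = 1 << 16  # lower bound of the 32-bit state; renormalize in 16-bit words

def rans_encode(symbols, freqs, cumfreqs):
    state, words = LOW, []
    for s in reversed(symbols):                    # ANS decodes in LIFO order
        f, c = freqs[s], cumfreqs[s]
        while state >= (f << (32 - PRECISION)):    # renormalize: flush low 16 bits
            words.append(state & 0xFFFF)
            state >>= 16
        state = ((state // f) << PRECISION) + (state % f) + c
    return state, words[::-1]                      # decoder reads last-flushed words first

def rans_decode(state, words, freqs, cumfreqs, n):
    words, symbols = list(words), []
    for _ in range(n):
        slot = state & MASK
        s = max(k for k in cumfreqs if cumfreqs[k] <= slot)  # symbol owning this slot
        state = freqs[s] * (state >> PRECISION) + slot - cumfreqs[s]
        while state < LOW and words:               # renormalize: pull 16-bit words back in
            state = (state << 16) | words.pop(0)
        symbols.append(s)
    return symbols

# Example: binary source with P(0)=3/4, P(1)=1/4 at 12-bit precision.
freqs = {0: 3072, 1: 1024}
cumfreqs = {0: 0, 1: 3072}
data = [0, 0, 1, 0, 1, 1, 0, 0]
state, words = rans_encode(data, freqs, cumfreqs)
assert rans_decode(state, words, freqs, cumfreqs, len(data)) == data
```

FSAR's contribution is to fold the autoregressive context into the coder's finite-state machine via lookup tables; this generic sketch does not attempt to show that part.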
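The Open Datasets row notes that BPD is computed from the length of the actual compression bitstream. Below is a minimal sketch of that metric, assuming the bitstream is held as a Python bytes object and the images use the usual height x width x channels layout; the example numbers are illustrative, not results from the paper.

```python
def bits_per_dimension(bitstream: bytes, num_images: int,
                       height: int, width: int, channels: int = 3) -> float:
    """BPD = total compressed bits / total number of coded dimensions."""
    total_bits = 8 * len(bitstream)
    total_dims = num_images * height * width * channels
    return total_bits / total_dims

# Illustrative only: a 13 MB bitstream for 10,000 CIFAR10 images (32x32x3)
# corresponds to 13e6 * 8 / (10000 * 32 * 32 * 3) ≈ 3.39 BPD.
```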
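The Experiment Setup row describes an Order-1 Markov latent prior p(y_{i,j,k} | y_{i,j,k-1}) implemented as a 3-layer network (3 Linear layers, 2 ReLU layers) and an STHQ quantizer using Gumbel-softmax with temperature 0.5. The PyTorch sketch below is one minimal reading of that description; the hidden width, the one-hot conditioning on the previous symbol, and the interpretation of STHQ as straight-through Gumbel-softmax are assumptions not given in the excerpt.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Order1MarkovPrior(nn.Module):
    """p(y_t | y_{t-1}) over K discrete states, parameterized by a 3-layer MLP.

    The hidden width (assumed 64) and one-hot conditioning are illustrative choices.
    """
    def __init__(self, num_states: int = 256, hidden: int = 64):
        super().__init__()
        self.num_states = num_states
        self.net = nn.Sequential(            # 3 Linear layers, 2 ReLU layers
            nn.Linear(num_states, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_states),
        )

    def forward(self, prev_symbol: torch.Tensor) -> torch.Tensor:
        # prev_symbol: integer indices of y_{t-1}; returns logits over y_t.
        onehot = F.one_hot(prev_symbol, self.num_states).float()
        return self.net(onehot)

def sthq(logits: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Straight-through hard quantization via Gumbel-softmax (temperature 0.5).

    hard=True emits one-hot samples in the forward pass while gradients flow
    through the soft relaxation; this is one common reading of "STHQ", assumed here.
    """
    return F.gumbel_softmax(logits, tau=tau, hard=True)

# Usage: average negative log-likelihood of symbols under the Order-1 prior.
prior = Order1MarkovPrior(num_states=256)
prev, cur = torch.randint(0, 256, (8,)), torch.randint(0, 256, (8,))
nll = F.cross_entropy(prior(prev), cur)      # average -log p(y_t | y_{t-1}) in nats
```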