Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression
Authors: Xi Zhang, Xiaolin Wu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experimentation on standard benchmark datasets, we demonstrate the effectiveness of our approach in significantly improving the compression performance of DNN-based image compression systems. Our method outperforms existing DNN quantization schemes in terms of both rate-distortion performance and computational complexity, increasing the cost-effectiveness of DNN image compression models. |
| Researcher Affiliation | Academia | Xi Zhang¹, Xiaolin Wu²; ¹Department of Electronic Engineering, Shanghai Jiao Tong University; ²School of Computing and Artificial Intelligence, Southwest Jiaotong University; xzhang9308@gmail.com, xlw@swjtu.edu.cn |
| Pseudocode | Yes | A.4.1 Mathematical Formulation: Given a lattice Λ with a basis B = [b_1, b_2, ..., b_n], we want to find a lattice point v ∈ ℝ^n that is close to a target vector t ∈ ℝ^n. Babai's rounding algorithm involves the following steps: 1. Compute the Gram-Schmidt orthogonalization of the basis B, resulting in an orthogonal basis B* = [b*_1, b*_2, ..., b*_n]. 2. Express the target vector t in terms of the orthogonal basis B*: t = Σ_i c_i b*_i, where c_i are the coordinates of t in the Gram-Schmidt basis B*. 3. Round each coordinate c_i to the nearest integer: c' = (round(c_1), round(c_2), ..., round(c_n)). 4. Construct the approximate lattice point: v = Σ_{i=1}^{n} round(c_i) b_i. (A runnable sketch of these steps is given after the table.) |
| Open Source Code | Yes | The source code will be available before the NeurIPS conference. |
| Open Datasets | Yes | The training dataset comprises high-quality images carefully selected from the ImageNet dataset [15]. The trained compression models are evaluated on two widely used datasets: the Kodak dataset [14] and the CLIC validation set [8]. |
| Dataset Splits | No | The paper mentions training on the ImageNet dataset and evaluation on the Kodak dataset and CLIC validation set, but it does not specify explicit training/validation/test splits (e.g., percentages or counts) for the datasets used in the experiments. |
| Hardware Specification | Yes | All experiments are conducted with four RTX 3090 GPUs. |
| Software Dependencies | No | All modules including the proposed learnable lattice vector quantization modules are implemented in PyTorch and CompressAI [9]. The paper names the software used (PyTorch and CompressAI) but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Each network and context model combination is trained for 3 million iterations. We train each model using the Adam optimizer with β1 = 0.9, β2 = 0.999. The initial learning rate is set to 10⁻⁴ for the first 2M iterations, and then decayed to 10⁻⁵ for the remaining 1M iterations. Training images are random-cropped to 256 × 256 and batched into groups of 16. (A training-loop sketch follows the table.) |
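
The Babai rounding steps quoted in the Pseudocode row can be illustrated with a short NumPy sketch. This is a minimal illustration of the four listed steps under the stated definitions, not the authors' released code; the function names `gram_schmidt` and `babai_round` are placeholders introduced here.

```python
import numpy as np

def gram_schmidt(B: np.ndarray) -> np.ndarray:
    """Classical Gram-Schmidt: columns of B -> orthogonal (unnormalized) columns B*."""
    B_star = B.astype(float).copy()
    for i in range(B.shape[1]):
        for j in range(i):
            b_j = B_star[:, j]
            B_star[:, i] -= (B[:, i] @ b_j) / (b_j @ b_j) * b_j
    return B_star

def babai_round(B: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Steps 1-4 above: orthogonalize B, express t in B*, round, rebuild with the original basis."""
    B_star = gram_schmidt(B)
    # Coordinates c_i of t in the orthogonal basis B*.
    c = np.array([(t @ B_star[:, i]) / (B_star[:, i] @ B_star[:, i])
                  for i in range(B.shape[1])])
    # Round each coordinate and construct v = sum_i round(c_i) * b_i.
    return B @ np.round(c)

# Example: quantize a target vector onto the lattice spanned by the columns of B.
B = np.array([[2.0, 1.0],
              [0.0, 1.0]])
t = np.array([2.6, 0.9])
print(babai_round(B, t))  # -> [3. 1.], an approximate closest lattice point to t
```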
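
The training schedule in the Experiment Setup row (Adam with β1 = 0.9, β2 = 0.999, learning rate 10⁻⁴ for the first 2M iterations then 10⁻⁵ for the final 1M, 256 × 256 crops, batch size 16) could be wired up roughly as below. This is a hedged sketch: the dummy model, random tensors, and MSE loss stand in for the compression network, the ImageNet crop loader, and the rate-distortion objective; it is not the authors' training code.

```python
import torch
from torch import nn

# Dummy stand-in for the compression network + context model (not the paper's architecture).
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(), nn.Conv2d(64, 3, 3, padding=1))

# Adam with beta1 = 0.9, beta2 = 0.999 and an initial learning rate of 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))
# Decay the learning rate to 1e-5 after the first 2M of 3M total iterations.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[2_000_000], gamma=0.1)

batch_size, total_iters = 16, 3_000_000
for step in range(total_iters):
    # Random tensors stand in for 256x256 random crops of the ImageNet training images.
    x = torch.rand(batch_size, 3, 256, 256)
    loss = nn.functional.mse_loss(model(x), x)  # placeholder for the rate-distortion loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```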