Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression

Authors: Xi Zhang, Xiaolin Wu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experimentation on standard benchmark datasets, we demonstrate the effectiveness of our approach in significantly improving the compression performance of DNN-based image compression systems. Our method outperforms existing DNN quantization schemes in terms of both rate-distortion performance and computational complexity, increasing the cost-effectiveness of DNN image compression models.
Researcher Affiliation | Academia | Xi Zhang (1), Xiaolin Wu (2); (1) Department of Electronic Engineering, Shanghai Jiao Tong University; (2) School of Computing and Artificial Intelligence, Southwest Jiaotong University; xzhang9308@gmail.com, xlw@swjtu.edu.cn
Pseudocode | Yes | A.4.1 Mathematical Formulation: Given a lattice Λ with a basis B = [b_1, b_2, ..., b_n], we want to find a lattice point v ∈ ℝⁿ that is close to a target vector t ∈ ℝⁿ. Babai's rounding algorithm involves the following steps: 1. Compute the Gram-Schmidt orthogonalization of the basis B, resulting in an orthogonal basis B* = [b*_1, b*_2, ..., b*_n]. 2. Express the target vector t in terms of the orthogonal basis B*: t = Σ_{i=1}^{n} c_i b*_i, where c_i are the coordinates of t in the Gram-Schmidt basis B*. 3. Round each coordinate c_i to the nearest integer: c' = (round(c_1), round(c_2), ..., round(c_n)). 4. Construct the approximate lattice point: v = Σ_{i=1}^{n} round(c_i) b_i.
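The quoted formulation maps directly onto a few lines of code. Below is a minimal sketch in Python/NumPy of the rounding procedure as described above; the function name babai_rounding and the column-vector basis convention are illustrative assumptions, not the paper's released implementation.

import numpy as np

def babai_rounding(B: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Approximate closest lattice point to t, following the quoted steps.

    B : (n, n) matrix whose columns are the lattice basis vectors b_1, ..., b_n.
    t : (n,) target vector.
    """
    n = B.shape[1]

    # Step 1: Gram-Schmidt orthogonalization of the basis columns -> B*.
    B_star = np.zeros_like(B, dtype=float)
    for i in range(n):
        b = B[:, i].astype(float)
        for j in range(i):
            bj = B_star[:, j]
            b = b - (np.dot(B[:, i], bj) / np.dot(bj, bj)) * bj
        B_star[:, i] = b

    # Step 2: coordinates of t in the orthogonal basis, c_i = <t, b*_i> / <b*_i, b*_i>.
    c = (B_star.T @ t) / np.sum(B_star ** 2, axis=0)

    # Steps 3-4: round each coordinate and reconstruct with the original basis,
    # v = sum_i round(c_i) * b_i.
    return B @ np.round(c)

For the trivial integer-lattice basis B = I, this reduces to plain rounding: babai_rounding(np.eye(2), np.array([0.4, 1.7])) returns array([0., 2.]).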
Open Source Code | Yes | The source code will be available before the NeurIPS conference.
Open Datasets | Yes | The training dataset comprises high-quality images carefully selected from the ImageNet dataset [15]. The trained compression models are evaluated on two widely used datasets: the Kodak dataset [14] and the CLIC validation set [8].
Dataset Splits | No | The paper mentions training on the ImageNet dataset and evaluation on the Kodak dataset and CLIC validation set, but it does not specify explicit training/validation/test splits (e.g., percentages or counts) for the datasets used in the experiments.
Hardware Specification | Yes | All experiments are conducted with four RTX 3090 GPUs.
Software Dependencies | No | All modules including the proposed learnable lattice vector quantization modules are implemented in PyTorch and CompressAI [9]. The paper mentions the software used (PyTorch and CompressAI) but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | Each network and context model combination is trained for 3 million iterations. We train each model using the Adam optimizer with β1 = 0.9, β2 = 0.999. The initial learning rate is set to 10⁻⁴ for the first 2M iterations and then decayed to 10⁻⁵ for the remaining 1M iterations. Training images are random-cropped to 256 × 256 patches and batched with a batch size of 16.
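Read literally, the quoted setup translates into a short PyTorch configuration. The following is a hedged sketch only: the placeholder model, the hypothetical rate_distortion_loss, and the data pipeline are illustrative stand-ins, and only the hyperparameters (Adam betas, the 10⁻⁴ to 10⁻⁵ decay after 2M of 3M iterations, 256 × 256 crops, batch size 16) come from the paper.

import torch
from torch import nn, optim
from torchvision import transforms

# Placeholder network; the actual compression models and context models are not shown here.
model = nn.Conv2d(3, 192, kernel_size=3, padding=1)

# Adam with beta1 = 0.9, beta2 = 0.999 and an initial learning rate of 1e-4.
optimizer = optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

# Decay 1e-4 -> 1e-5 (factor 0.1) after the first 2M of 3M total iterations.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[2_000_000], gamma=0.1)

# 256 x 256 random crops, batched into mini-batches of 16.
train_transform = transforms.Compose([
    transforms.RandomCrop(256),
    transforms.ToTensor(),
])
batch_size = 16
total_iterations = 3_000_000

# Training-loop skeleton (data loading and the rate-distortion objective are omitted):
# for step in range(total_iterations):
#     x = next(train_iter)                        # batch of 16 random 256x256 crops
#     loss = rate_distortion_loss(model(x), x)    # hypothetical loss function
#     optimizer.zero_grad(); loss.backward(); optimizer.step(); scheduler.step()

Using MultiStepLR with a single milestone is one simple way to realize the stated two-stage schedule; the paper does not specify how the decay is implemented.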