Soft-to-Hard Vector Quantization for End-to-End Learning Compressible Representations
Authors: Eirikur Agustsson, Fabian Mentzer, Michael Tschannen, Lukas Cavigelli, Radu Timofte, Luca Benini, Luc Van Gool
NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a new approach to learn compressible representations in deep architectures with an end-to-end training strategy. Our method is based on a soft (continuous) relaxation of quantization and entropy, which we anneal to their discrete counterparts throughout training. We showcase this method for two challenging applications: Image compression and neural network compression. While these tasks have typically been approached with different methods, our soft-to-hard quantization approach gives results competitive with the state-of-the-art for both. (See the first sketch below the table for an illustration of the soft-to-hard relaxation.) |
| Researcher Affiliation | Collaboration | Eirikur Agustsson ETH Zurich aeirikur@vision.ee.ethz.ch Fabian Mentzer ETH Zurich mentzerf@vision.ee.ethz.ch Michael Tschannen ETH Zurich michaelt@nari.ee.ethz.ch Lukas Cavigelli ETH Zurich cavigelli@iis.ee.ethz.ch Radu Timofte ETH Zurich & Merantix timofter@vision.ee.ethz.ch Luca Benini ETH Zurich benini@iis.ee.ethz.ch Luc Van Gool KU Leuven & ETH Zurich vangool@vision.ee.ethz.ch |
| Pseudocode | No | The paper describes mathematical formulations and algorithmic steps in prose and equations (e.g., Section 3.2 'Our Method'). However, it does not include any clearly labeled 'Pseudocode' or 'Algorithm' block or figure. |
| Open Source Code | No | The paper does not provide any specific links to source code repositories or explicit statements confirming the release of source code for the described methodology. It does not mention supplementary material containing code for the methodology. |
| Open Datasets | Yes | Our training set is composed similarly to that described in [4]. We used a subset of 90,000 images from ImageNet [9]. To evaluate the image compression performance of our Soft-to-Hard Vector Quantization Autoencoder (SHVQ) method we use four datasets, namely Kodak [2], B100 [31], Urban100 [14], ImageNet100 (100 randomly selected images from ImageNet [25]). For DNN compression, we investigate the ResNet [13] architecture for image classification. We adopt the same setting as [6] and consider a 32-layer architecture trained for CIFAR-10 [18]. |
| Dataset Splits | Yes | Our training set is composed similarly to that described in [4]. We used a subset of 90,000 images from ImageNet [9], which we downsampled by a factor 0.7 and trained on crops of 128 x 128 pixels, with a batch size of 15. To estimate the probability distribution p for optimizing (8), we maintain a histogram over 5,000 images, which we update every 10 iterations with the images from the current batch. (A sketch of such a running histogram appears after the table.) |
| Hardware Specification | Yes | Our full Image Compression Autoencoder has 6.37M trainable parameters, trained using Adam [17] for 300,000 iterations on an NVIDIA Titan X GPU using tensorflow. |
| Software Dependencies | No | The paper mentions using 'tensorflow' in Appendix A.2, but it does not specify a version number for tensorflow or any other software libraries or dependencies. It only mentions 'Adam [17]' as an optimizer, which is a method, not a specific software package with a version. |
| Experiment Setup | Yes | We trained different models using Adam [17], see Appendix A.2. We used a subset of 90,000 images from ImageNet [9], which we downsampled by a factor 0.7 and trained on crops of 128 x 128 pixels, with a batch size of 15. Our full Image Compression Autoencoder... trained using Adam [17] for 300,000 iterations... We used a learning rate schedule of 1e-4 for 250k iterations, then 1e-5 for 50k iterations. We implemented the entropy minimization by using L = 75 centers and chose β = 0.1... The training was performed with the same learning parameters as the original model was trained with (SGD with momentum 0.9). The annealing schedule used was a simple exponential one, σ(t + 1) = 1.001 σ(t) with σ(0) = 0.4. (The schedules are sketched below the table.) |
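
The abstract quoted under Research Type describes a soft (continuous) relaxation of quantization that is annealed toward hard assignment during training. The following is a minimal NumPy sketch of that idea for the scalar case, assuming soft assignments are computed as a softmax over (negative, σ-scaled) squared distances to a set of centers; it is an illustration of the technique, not the authors' TensorFlow implementation, and the function and parameter names are ours.

```python
import numpy as np

def soft_quantize(z, centers, sigma):
    """Softly assign values z to quantization centers.

    z: array of shape (n,), values to quantize
    centers: array of shape (L,), the (learnable) quantization centers
    sigma: softness parameter; larger sigma -> closer to hard assignment
    Returns the soft-quantized values and the soft-assignment weights.
    """
    # Squared distances between each value and each center, shape (n, L).
    d = (z[:, None] - centers[None, :]) ** 2
    # Softmax over centers; as sigma grows this approaches a one-hot
    # assignment to the nearest center.
    logits = -sigma * d
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    z_soft = w @ centers
    return z_soft, w

def hard_quantize(z, centers):
    """Hard nearest-center quantization (the sigma -> infinity limit)."""
    idx = np.argmin((z[:, None] - centers[None, :]) ** 2, axis=1)
    return centers[idx], idx
```

In this reading, the soft version keeps the mapping differentiable for training, the hard version yields the discrete symbols used for actual compression, and annealing σ shrinks the gap between the two.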
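The Experiment Setup row quotes two concrete schedules: the exponential annealing σ(t + 1) = 1.001 σ(t) with σ(0) = 0.4, and a piecewise-constant learning rate of 1e-4 for 250k iterations followed by 1e-5 for 50k iterations. A small sketch of both, with our own function names, for reference:

```python
def sigma_schedule(t, sigma0=0.4, growth=1.001):
    """Exponential annealing of the softness parameter: sigma(t) = sigma0 * growth**t."""
    return sigma0 * growth ** t

def learning_rate(t):
    """Piecewise-constant learning rate quoted for the image-compression autoencoder."""
    return 1e-4 if t < 250_000 else 1e-5
```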
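The Dataset Splits row mentions maintaining a histogram over 5,000 images, refreshed every 10 iterations, to estimate the symbol distribution p used in the entropy term. Below is a rough sketch of such a running histogram over center indices, assuming hard symbol indices are available per batch (e.g., from `hard_quantize` above); the class name, smoothing, and update policy are assumptions for illustration only.

```python
import numpy as np

class SymbolHistogram:
    """Running histogram of symbol (center-index) frequencies,
    used to form an empirical distribution p for an entropy penalty."""

    def __init__(self, num_centers, smoothing=1.0):
        # Laplace smoothing keeps every probability nonzero.
        self.counts = np.full(num_centers, smoothing)

    def update(self, symbol_indices):
        """Add the symbol counts of the current batch."""
        self.counts += np.bincount(symbol_indices, minlength=len(self.counts))

    def probabilities(self):
        return self.counts / self.counts.sum()

    def entropy_bits(self):
        """Empirical entropy in bits per symbol under the current histogram."""
        p = self.probabilities()
        return float(-(p * np.log2(p)).sum())
```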