Rate Distortion For Model Compression: From Theory To Practice

Authors: Weihao Gao, Yu-Han Liu, Chong Wang, Sewoong Oh

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we show that the proposed scheme improves upon the baseline in the compression-accuracy tradeoff. In Section 6, we demonstrate the empirical performance of the proposed objective on fully-connected neural networks on the MNIST dataset and convolutional networks on the CIFAR dataset.
Researcher Affiliation | Collaboration | (1) Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign (work done as an intern at Google); (2) Google, Inc.; (3) Bytedance, Inc.; (4) Department of Computer Science, University of Washington.
Pseudocode | No | The paper refers to 'Algorithm 1 in Appendix A', but the pseudocode or algorithm block itself is not present in the provided paper text.
Open Source Code | No | The paper mentions loading pretrained models from 'https://github.com/aaron-xichen/pytorch-playground' but does not provide any link or statement about open-sourcing the code for the methodology described in this paper.
Open Datasets | Yes | A fully-connected neural network on MNIST; a convolutional neural network with 5 convolutional layers and 3 fully-connected layers on CIFAR-10 and CIFAR-100. (A dataset-loading sketch follows the table.)
Dataset Splits | No | The paper mentions 'validation cross-entropy loss' and 'validation accuracy', implying the use of a validation set, but it does not provide specific details on the dataset splits (percentages, sample counts, or explicit methodology for creating splits).
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running its experiments.
Software Dependencies | No | The paper implies the use of PyTorch by mentioning 'pytorch-playground' as the source of pretrained models, but it does not list any software names with version numbers or other reproducible software dependencies.
Experiment Setup | Yes | For the pruning experiment, we choose the same compression rate for every convolutional and fully-connected layer, and plot the test accuracy and test cross-entropy loss against the compression rate. For the quantization experiment, we choose the same number of clusters for every convolutional and fully-connected layer, and again plot the test accuracy and test cross-entropy loss against the compression rate. To reduce the variance of estimating the weight importance matrix I_w, we use the temperature scaling method introduced by Guo et al. (2017) to improve model calibration. (A sketch of this per-layer protocol follows the table.)
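
The datasets named in the table (MNIST, CIFAR-10, CIFAR-100) are all publicly available through torchvision. A minimal loading sketch is shown below; the paper does not state its preprocessing, so the `ToTensor()` transform and the `./data` root are assumptions rather than the authors' settings.

```python
# Minimal sketch: loading the public datasets named above via torchvision.
# Preprocessing is not specified in the paper; ToTensor() is an assumption.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

mnist_train = datasets.MNIST(root="./data", train=True, download=True, transform=to_tensor)
mnist_test = datasets.MNIST(root="./data", train=False, download=True, transform=to_tensor)
cifar10_test = datasets.CIFAR10(root="./data", train=False, download=True, transform=to_tensor)
cifar100_test = datasets.CIFAR100(root="./data", train=False, download=True, transform=to_tensor)
```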
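
The experiment-setup row describes a uniform per-layer protocol: one compression rate shared by all convolutional and fully-connected layers for pruning, and one cluster count shared by all such layers for quantization. The sketch below is a hypothetical illustration of that protocol only; it uses plain magnitude pruning and k-means weight clustering as stand-ins and does not reproduce the paper's importance-weighted objective or the temperature-scaling calibration step. All function names are ours, not the authors'.

```python
# Hypothetical sketch of the uniform per-layer protocol described above:
# the same compression rate (pruning) or cluster count (quantization) is
# applied to every Conv2d and Linear layer. Magnitude pruning and k-means
# clustering are stand-ins for the paper's importance-weighted objective.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans


def prune_layer(weight: torch.Tensor, compression_rate: float) -> torch.Tensor:
    """Zero all but the largest-magnitude 1/compression_rate fraction of weights."""
    flat = weight.abs().flatten()
    keep = max(1, int(flat.numel() / compression_rate))
    threshold = torch.topk(flat, keep, largest=True).values.min()
    return weight * (weight.abs() >= threshold)


def quantize_layer(weight: torch.Tensor, num_clusters: int) -> torch.Tensor:
    """Replace each weight with its k-means cluster centroid."""
    flat = weight.detach().cpu().numpy().reshape(-1, 1)
    km = KMeans(n_clusters=num_clusters, n_init=10).fit(flat)
    quantized = km.cluster_centers_[km.labels_].reshape(weight.shape)
    return torch.from_numpy(quantized).to(weight.dtype)


def compress_model(model: nn.Module, compression_rate=None, num_clusters=None) -> nn.Module:
    """Apply the same rate / cluster count to every convolutional and linear layer."""
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, (nn.Conv2d, nn.Linear)):
                if compression_rate is not None:
                    module.weight.copy_(prune_layer(module.weight, compression_rate))
                if num_clusters is not None:
                    module.weight.copy_(quantize_layer(module.weight, num_clusters))
    return model
```

After compressing a pretrained model this way at several rates or cluster counts, test accuracy and test cross-entropy loss can be plotted against the compression rate, matching the evaluation described in the table.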