Information-Theoretic Understanding of Population Risk Improvement with Model Compression

Authors: Yuheng Bu, Weihao Gao, Shaofeng Zou, Venugopal Veeravalli (pp. 3300-3307)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'In this section, we provide some real-world experiments to validate our theoretical assertions and the DRHW K-means algorithm. Our experiments include compression of: (i) a three-layer fully connected network on MNIST; and (ii) a convolutional neural network with five conv layers and three linear layers on CIFAR10.' (An illustrative PyTorch reconstruction of these two architectures appears after this table.)
Researcher Affiliation | Collaboration | Yuheng Bu (1), Weihao Gao (1), Shaofeng Zou (2), Venugopal V. Veeravalli (1); (1) University of Illinois at Urbana-Champaign, Urbana, IL, USA; (2) University at Buffalo, The State University of New York, Buffalo, NY, USA. ... Currently with Bytedance Inc., Bellevue, WA, USA.
Pseudocode | No | No structured pseudocode or algorithm blocks are present in the paper.
Open Source Code | Yes | 'All the codes of our experiments are available at the following link https://github.com/wgao9/weight-quant.'
Open Datasets | Yes | 'Our experiments include compression of: (i) a three-layer fully connected network on MNIST; and (ii) a convolutional neural network with five conv layers and three linear layers on CIFAR10.'
Dataset Splits | No | The paper states: 'We use 10% of the training data to train the model for MNIST, and use 20% of the training data to train the model for CIFAR10.' This fixes the fraction of the training data used for training, but does not give explicit train/validation/test splits. (A sketch of such subsampling appears after this table.)
Hardware Specification | No | The paper provides no hardware details for its experiments, such as GPU models, CPU types, or cloud instance specifications.
Software Dependencies | No | The paper mentions PyTorch but does not specify its version or any other versioned software dependencies.
Experiment Setup | Yes | 'Our experiments include compression of: (i) a three-layer fully connected network on MNIST; and (ii) a convolutional neural network with five conv layers and three linear layers on CIFAR10.' ... 'We use 10% of the training data to train the model for MNIST, and use 20% of the training data to train the model for CIFAR10. For each experiment, we use the same number of clusters for each convolutional layer and fully connected layer.' and 'Diameter-regularized Hessian-weighted K-means with different β on the MNIST dataset with K = 7.' (A hedged sketch of such a quantizer appears at the end of this table.)
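
For orientation, here is a minimal PyTorch sketch of the two architectures as the paper describes them: a three-layer fully connected network for MNIST, and a CNN with five convolutional layers and three linear layers for CIFAR10. Only the layer counts come from the paper; the widths, kernel sizes, pooling, and activations below are assumptions, and the exact definitions live in the linked repository.

```python
import torch.nn as nn

# Hypothetical reconstruction: the paper fixes only the layer counts,
# so every width and kernel size here is an assumption.

class MNISTNet(nn.Module):
    """Three-layer fully connected network for 28x28 MNIST digits."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 10),
        )

    def forward(self, x):
        return self.net(x)


class CIFAR10Net(nn.Module):
    """Five convolutional layers followed by three linear layers."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 4 * 4, 512), nn.ReLU(),  # 32x32 input -> 4x4 after 3 pools
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```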
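The partial-training-data detail (10% of MNIST, 20% of CIFAR10) is straightforward to reproduce with a random subset. The sketch below assumes a random draw with a fixed seed; the paper does not say how its subsets were selected, so this is illustrative rather than the authors' procedure.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

def subsampled_loader(dataset, fraction, batch_size=128, seed=0):
    """Return a DataLoader over a random `fraction` of `dataset`."""
    g = torch.Generator().manual_seed(seed)  # fixed seed: an assumption
    n = int(fraction * len(dataset))
    idx = torch.randperm(len(dataset), generator=g)[:n]
    return DataLoader(Subset(dataset, idx.tolist()),
                      batch_size=batch_size, shuffle=True)

mnist = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())
cifar = datasets.CIFAR10("data", train=True, download=True,
                         transform=transforms.ToTensor())

mnist_loader = subsampled_loader(mnist, 0.10)  # 10% of MNIST training data
cifar_loader = subsampled_loader(cifar, 0.20)  # 20% of CIFAR10 training data
```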
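Finally, a hedged sketch of what a diameter-regularized, Hessian-weighted K-means quantizer with K clusters and regularization strength β could look like. The objective used here (Hessian-weighted squared error plus a β-weighted distance-to-center term as a proxy for cluster diameter) is an assumption made for illustration; the paper's exact regularizer and update rules should be taken from its equations and the repository linked above.

```python
import numpy as np

def drhw_kmeans(w, h, K=7, beta=0.1, iters=50, seed=0):
    """One plausible Lloyd-style loop for diameter-regularized,
    Hessian-weighted K-means (a sketch, not the paper's exact procedure).

    w    : flat array of weights to quantize
    h    : per-weight Hessian-diagonal importance values, h >= 0
    K    : number of clusters (shared codewords)
    beta : diameter regularization strength
    """
    rng = np.random.default_rng(seed)
    centers = rng.choice(w, size=K, replace=False)
    assign = np.zeros(len(w), dtype=int)
    for _ in range(iters):
        # Assignment: Hessian-weighted squared error plus a beta-weighted
        # distance-to-center proxy for the cluster diameter (assumed form).
        cost = h[:, None] * (w[:, None] - centers[None, :]) ** 2
        cost += beta * np.abs(w[:, None] - centers[None, :])
        assign = cost.argmin(axis=1)
        # Update: Hessian-weighted centroid of each nonempty cluster.
        for k in range(K):
            mask = assign == k
            if mask.any():
                centers[k] = np.average(w[mask], weights=h[mask] + 1e-12)
    return centers[assign], centers, assign

# Hypothetical usage on random data, with K = 7 as in the MNIST experiment:
w = np.random.randn(1000)
hdiag = np.random.rand(1000)
w_quantized, codebook, labels = drhw_kmeans(w, hdiag, K=7, beta=0.1)
```

Applying the same number of clusters to every convolutional and fully connected layer, as the setup row states, would amount to calling such a routine once per layer with the same K.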