Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition

Authors: Lucas Liebenwein, Alaa Maalouf, Dan Feldman, Daniela Rus

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments indicate that our method outperforms existing low-rank compression approaches across a wide range of networks and data sets." (Section 3: Experiments)
Researcher Affiliation | Academia | "Lucas Liebenwein (MIT CSAIL, lucas@csail.mit.edu); Alaa Maalouf (University of Haifa, alaamalouf12@gmail.com); Oren Gal (University of Haifa, orengal@alumni.technion.ac.il); Dan Feldman (University of Haifa, dannyf.post@gmail.com); Daniela Rus (MIT CSAIL, rus@csail.mit.edu)"
Pseudocode | Yes | "Algorithm 1 ALDS(θ, CR, nseed). Input: θ: network parameters; CR: overall compression ratio; nseed: number of random seeds to initialize. Output: k1, ..., kL: number of subspaces for each layer; j1, ..., jL: desired rank per subspace for each layer." (A hedged sketch of the per-layer decomposition step appears after the table.)
Open Source Code | Yes | "Code: https://github.com/lucaslie/torchprune"
Open Datasets | Yes | "We test our compression framework on ResNet20 (He et al., 2016), DenseNet22 (Huang et al., 2017), WRN16-8 (Zagoruyko and Komodakis, 2016), and VGG16 (Simonyan and Zisserman, 2015) on CIFAR10 (Torralba et al., 2008); ResNet18 (He et al., 2016), AlexNet (Krizhevsky et al., 2012), and MobileNetV2 (Sandler et al., 2018) on ImageNet (Russakovsky et al., 2015); and on DeeplabV3 (Chen et al., 2017) with a ResNet50 backbone on Pascal VOC segmentation data (Everingham et al., 2015)."
Dataset Splits | Yes | "We train reference networks on CIFAR10, ImageNet, and VOC, and then compress and retrain the networks once with r = e for various baseline comparisons and compression ratios."
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments in the main text; it refers to the supplementary material for compute resources.
Software Dependencies | No | The paper mentions 'grouped convolutions in PyTorch (Paszke et al., 2017)' but does not provide version numbers for PyTorch or any other software dependencies.
Experiment Setup | No | The paper describes a unified compress-retrain pipeline with 'e epochs' for training and 'r epochs' for retraining, and mentions reusing the 'training hyperparameters from epochs [e - r, e]', but it does not provide concrete hyperparameter values or detailed training configurations in the main text, referring to the supplementary material instead. (A hedged sketch of this pipeline appears after the table.)
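The Pseudocode row above refers to Algorithm 1 (ALDS), which outputs, for each layer, a number of subspaces k and a per-subspace rank j given an overall compression ratio CR. The sketch below is a minimal illustration of the per-layer decomposition step, assuming a column-wise partition of the layer's (folded) weight matrix and a truncated SVD per subspace; the function name and the error measure are assumptions for illustration, not the authors' torchprune implementation.

```python
import torch

def decompose_layer(W: torch.Tensor, k: int, j: int):
    """Split a layer's weight matrix into k column subspaces and keep rank j in each.

    Illustrative sketch only (hypothetical helper, not the paper's torchprune code).
    Returns the per-subspace factors and the layer's relative approximation error.
    """
    col_groups = torch.arange(W.shape[1]).chunk(k)   # partition input dimensions into k groups
    factors, W_hat = [], torch.zeros_like(W)
    for cols in col_groups:
        U, S, Vh = torch.linalg.svd(W[:, cols], full_matrices=False)
        r = min(j, S.numel())                        # rank kept in this subspace
        U_r = U[:, :r] * S[:r]                       # fold singular values into the left factor
        V_r = Vh[:r, :]
        factors.append((U_r, V_r))
        W_hat[:, cols] = U_r @ V_r                   # rank-r reconstruction of this subspace
    rel_err = (torch.linalg.norm(W - W_hat) / torch.linalg.norm(W)).item()
    return factors, rel_err

# Example: a 256x512 weight matrix, 4 subspaces, rank 8 each
factors, err = decompose_layer(torch.randn(256, 512), k=4, j=8)
```

Per the Input/Output signature quoted in the table, Algorithm 1 then selects (k, j) for every layer so that the network as a whole meets the requested compression ratio CR.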
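The Experiment Setup row describes a unified compress-retrain pipeline: train a reference network for e epochs, compress it once, then retrain for r epochs while reusing the training hyperparameters from epochs [e - r, e]. The sketch below shows only that control flow; train_one_epoch, lr_schedule, and compress are hypothetical placeholders, and the SGD settings are assumptions, since the concrete values are deferred to the supplementary material.

```python
import torch

def compress_retrain(model, loader, e, r, compression_ratio):
    """Sketch of the compress-retrain pipeline (e, r as in the quoted text).

    train_one_epoch, lr_schedule, and compress are hypothetical placeholders;
    the SGD settings are assumptions, not values reported by the paper.
    """
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    for epoch in range(e):                               # 1) train the reference network for e epochs
        for g in opt.param_groups:
            g["lr"] = lr_schedule(epoch)                 # hypothetical learning-rate schedule
        train_one_epoch(model, loader, opt)
    model = compress(model, compression_ratio)           # 2) compress once (e.g., layer-wise decomposition)
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # fresh optimizer over the new parameters
    for epoch in range(e - r, e):                        # 3) retrain, replaying the schedule of epochs [e - r, e]
        for g in opt.param_groups:
            g["lr"] = lr_schedule(epoch)
        train_one_epoch(model, loader, opt)
    return model
```

With r = e, as quoted in the Dataset Splits row, the retraining phase replays the full original training schedule on the compressed network.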