Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition
Authors: Lucas Liebenwein, Alaa Maalouf, Oren Gal, Dan Feldman, Daniela Rus
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments indicate that our method outperforms existing low-rank compression approaches across a wide range of networks and data sets. (Section 3, Experiments) |
| Researcher Affiliation | Academia | Lucas Liebenwein (MIT CSAIL, lucas@csail.mit.edu); Alaa Maalouf (University of Haifa, alaamalouf12@gmail.com); Oren Gal (University of Haifa, orengal@alumni.technion.ac.il); Dan Feldman (University of Haifa, dannyf.post@gmail.com); Daniela Rus (MIT CSAIL, rus@csail.mit.edu) |
| Pseudocode | Yes | Algorithm 1 ALDS(θ, CR, nseed). Input: θ: network parameters; CR: overall compression ratio; nseed: number of random seeds to initialize. Output: k_1, ..., k_L: number of subspaces for each layer; j_1, ..., j_L: desired rank per subspace for each layer. (A minimal low-rank factorization sketch follows the table.) |
| Open Source Code | Yes | Code: https://github.com/lucaslie/torchprune |
| Open Datasets | Yes | We test our compression framework on ResNet20 (He et al., 2016), DenseNet22 (Huang et al., 2017), WRN16-8 (Zagoruyko and Komodakis, 2016), and VGG16 (Simonyan and Zisserman, 2015) on CIFAR10 (Torralba et al., 2008); ResNet18 (He et al., 2016), AlexNet (Krizhevsky et al., 2012), and MobileNetV2 (Sandler et al., 2018) on ImageNet (Russakovsky et al., 2015); and on DeeplabV3 (Chen et al., 2017) with a ResNet50 backbone on Pascal VOC segmentation data (Everingham et al., 2015). |
| Dataset Splits | Yes | We train reference networks on CIFAR10, Image Net, and VOC, and then compress and retrain the networks once with r = e for various baseline comparisons and compression ratios. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments in the main text; it defers compute-resource details to the supplementary material. |
| Software Dependencies | No | The paper mentions 'grouped convolutions in PyTorch (Paszke et al., 2017)' but does not provide specific version numbers for PyTorch or any other software dependency. (A grouped-convolution sketch follows the table.) |
| Experiment Setup | No | The paper describes a unified compress-retrain pipeline with e epochs of training and r epochs of retraining, reusing the training hyperparameters from epochs [e - r, e], but it does not provide concrete hyperparameter values or detailed training configurations in the main text, deferring to the supplementary material instead. (A pipeline sketch follows the table.) |
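
The ALDS pseudocode above selects, per layer, a number of subspaces k_ℓ and a rank j_ℓ; the building block underneath is a truncated-SVD factorization of each layer's weight matrix. Below is a minimal sketch of that building block only, assuming PyTorch; the function name `low_rank_factorize` and the choice to fold the singular values into the left factor are our own, not the authors' torchprune implementation.

```python
import torch

def low_rank_factorize(weight: torch.Tensor, rank: int):
    """Approximate an (out_features, in_features) matrix by two factors of
    rank `rank` via truncated SVD, so that weight ≈ left @ right.

    ALDS goes further: it splits each layer into k subspaces and searches
    for (k, j) per layer under a global compression ratio (Algorithm 1);
    this sketch covers only the single-subspace, fixed-rank case.
    """
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    left = U[:, :rank] * S[:rank]   # (out_features, rank); singular values folded in
    right = Vh[:rank, :]            # (rank, in_features)
    return left, right
```

Replacing an `out × in` linear layer with two sequential layers built from `left` and `right` reduces the parameter count from `out·in` to `rank·(out + in)`, which is where the compression comes from whenever `rank` is small enough.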
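The software-dependencies row notes that the paper realizes its decompositions with grouped convolutions in PyTorch. As a hedged illustration of that mechanism (the exact layer construction and the placement of the spatial kernel may differ from the paper's), a k-subspace, rank-j convolution can be expressed as a grouped convolution followed by a 1×1 recombination; the channel sizes below are hypothetical:

```python
import torch.nn as nn

# Hypothetical sizes for illustration, not taken from the paper.
in_ch, out_ch, k, j = 64, 128, 4, 8   # k subspaces, rank j per subspace

decomposed = nn.Sequential(
    # groups=k splits the in_ch input channels into k groups of in_ch // k
    # and maps each group to j intermediate channels independently.
    nn.Conv2d(in_ch, k * j, kernel_size=3, padding=1, groups=k, bias=False),
    # A 1x1 convolution recombines the k * j intermediate channels.
    nn.Conv2d(k * j, out_ch, kernel_size=1, bias=False),
)
```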
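Finally, the experiment-setup row describes a unified compress-retrain pipeline: train for e epochs, compress once, then retrain for r epochs while reusing the training hyperparameters from epochs [e - r, e]. A minimal sketch of that loop follows; `make_optimizer`, `train_one_epoch`, and `compress` are placeholder hooks standing in for whatever training recipe and compression method (e.g. ALDS) are in use, not the torchprune API.

```python
def compress_retrain(model, train_loader, e, r, make_optimizer, train_one_epoch, compress):
    """Train for e epochs, compress, then retrain for r epochs while replaying
    the last r epochs of the original schedule (epochs [e - r, e])."""
    optimizer, scheduler = make_optimizer(model)
    for _ in range(e):                          # initial training
        train_one_epoch(model, train_loader, optimizer)
        scheduler.step()

    model = compress(model)                     # e.g. ALDS low-rank decomposition

    optimizer, scheduler = make_optimizer(model)
    for _ in range(e - r):                      # fast-forward the schedule so retraining
        scheduler.step()                        # reuses hyperparameters from epochs [e - r, e]
    for _ in range(r):                          # retraining
        train_one_epoch(model, train_loader, optimizer)
        scheduler.step()
    return model
```

Setting r = e, as in the baseline comparisons quoted above, corresponds to retraining with the full original schedule.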