An Operator Theoretic View On Pruning Deep Neural Networks

Authors: William T. Redman, Maria Fonoberova, Ryan Mohr, Yannis Kevrekidis, Igor Mezić

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We make use of recent advances in dynamical systems theory, namely Koopman operator theory, to define a new class of theoretically motivated pruning algorithms. We show that these algorithms can be equivalent to magnitude- and gradient-based pruning, unifying these seemingly disparate methods, and find that they can be used to shed light on magnitude pruning's performance during the early part of training. We found that, for both MnistNet and ResNet-20, KMP and GMP produced nearly identical results, both immediately after pruning and after one epoch of refinement (Fig. 1). These results support the claim developed above that KMP and GMP are equivalent in the long training time limit."
Researcher Affiliation | Collaboration | William T. Redman (AIMdyn Inc.; UC Santa Barbara) wredman@ucsb.edu; Maria Fonoberova & Ryan Mohr (AIMdyn Inc.) {mfonoberova, mohrr}@aimdyn.com; Ioannis G. Kevrekidis (Johns Hopkins University) yannisk@jhu.edu; Igor Mezić (AIMdyn Inc.; UC Santa Barbara) mezic@ucsb.edu
Pseudocode | Yes | "Algorithm 1: General form of Koopman-based pruning." (a hedged sketch of this general form appears below the table)
Open Source Code | Yes | "Our code has been made publicly available." Repository: https://github.com/william-redman/Koopman pruning
Open Datasets | Yes | "MnistNet (Blalock et al., 2020), pre-trained on MNIST [1], and ResNet-20, pre-trained on CIFAR-10 [2], are presented in Fig. 1." [1] https://github.com/JJGO/shrinkbench-models/tree/master/mnist [2] https://github.com/JJGO/shrinkbench-models/tree/master/cifar10
Dataset Splits | No | The paper mentions training on MNIST and CIFAR-10 and evaluating accuracy, but it does not specify explicit training, validation, and test splits with percentages, sample counts, or references to predefined splits.
Hardware Specification | Yes | "All pruning experiments performed and reported in the main text were done on a 2014 MacBook Air (1.4 GHz Intel Core i5) running ShrinkBench (Blalock et al., 2020)."
Software Dependencies | Yes | "Memory usage was computed using the Python module memory-profiler 0.58.0 [4]." [4] https://pypi.org/project/memory-profiler/ (a usage sketch appears below the table)
Experiment Setup | Yes | "All hyperparameters of the DNNs were the same as the off-the-shelf implementation of ShrinkBench, except that we allowed for pruning of the classifier layer. Training experiments were repeated independently three times, each with a different random seed. A single epoch's worth of data (i.e. 391 iterations for ResNet-20 and 469 iterations for MnistNet) was used to construct the Koopman operator." (a snapshot-collection sketch appears below the table)
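
The paper's Algorithm 1 is only named in the table above, not reproduced. The following is a minimal sketch of what one Koopman-based magnitude-pruning (KMP) step could look like, assuming (i) the observables are the network weights themselves, (ii) the Koopman operator is approximated by a DMD-style least-squares fit to one epoch of weight snapshots, and (iii) weights are scored by the magnitude of their Koopman-predicted future values. The function name and interface are hypothetical, not the authors' implementation.

```python
import numpy as np

def koopman_magnitude_pruning(snapshots, sparsity, horizon=100):
    """Hypothetical sketch of a Koopman-based magnitude-pruning (KMP) step.

    snapshots : (T, N) array of flattened weights saved at T consecutive
                training iterations (one epoch's worth in the paper).
    sparsity  : fraction of weights to prune (0 < sparsity < 1).
    horizon   : number of iterations to propagate forward with the
                approximate Koopman operator before scoring weights.
    """
    # DMD-style least-squares approximation of the Koopman operator acting
    # on the weights. The dense N x N operator is a toy choice; real code
    # would use a rank-reduced (exact DMD) factorization for large networks.
    X, Y = snapshots[:-1].T, snapshots[1:].T
    K = Y @ np.linalg.pinv(X)

    # Predict the weights `horizon` iterations into the future.
    w_pred = np.linalg.matrix_power(K, horizon) @ snapshots[-1]

    # Magnitude pruning on the *predicted* weights: mask out the smallest
    # `sparsity` fraction.
    n_prune = int(sparsity * w_pred.size)
    threshold = np.partition(np.abs(w_pred), n_prune)[n_prune]
    return (np.abs(w_pred) >= threshold).astype(np.float32)  # binary mask
```

Note that, in this sketch, setting horizon = 0 scores weights by their current magnitudes, i.e. it reduces to ordinary magnitude pruning, which illustrates (under these assumptions) the sense in which Koopman-based and magnitude-based pruning can coincide.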
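The experiment-setup row states that one epoch of data was used to construct the Koopman operator, but not how the weight trajectory was recorded. A minimal PyTorch sketch is given below; the helper name and arguments are assumptions, and ShrinkBench's own training loop would differ in detail.

```python
import torch

def collect_weight_snapshots(model, loader, loss_fn, optimizer, device="cpu"):
    """Record the flattened weight vector after every optimizer step of a
    single epoch (e.g. 391 iterations for ResNet-20/CIFAR-10 or 469 for
    MnistNet/MNIST at the batch sizes implied above)."""
    snapshots = []
    model.train()
    for inputs, targets in loader:          # one pass over the loader = one epoch
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            snapshots.append(
                torch.cat([p.detach().flatten() for p in model.parameters()]).cpu()
            )
    return torch.stack(snapshots)           # shape: (iterations, n_weights)
```

The resulting (iterations, n_weights) tensor, converted with `.numpy()`, matches the `snapshots` argument expected by the `koopman_magnitude_pruning` sketch.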
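The dependencies row cites memory-profiler 0.58.0 but not how it was invoked. One standard pattern with this module is shown below; `run_pruning_experiment` is a hypothetical stand-in for a single pruning run.

```python
from memory_profiler import memory_usage

def run_pruning_experiment():
    # Hypothetical stand-in for one pruning run (e.g. a ShrinkBench experiment).
    ...

# Sample the process's memory (in MiB) every 0.1 s while the function runs.
samples = memory_usage((run_pruning_experiment, (), {}), interval=0.1)
print(f"peak memory: {max(samples):.1f} MiB")
```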