UPSCALE: Unconstrained Channel Pruning
Authors: Alvin Wan, Hanxiang Hao, Kaushik Patnaik, Yueyang Xu, Omer Hadad, David Güera, Zhile Ren, Qi Shan
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present extensive experimentation to show that unconstrained pruning can attain significantly higher accuracy than constrained pruning, especially for modern, larger models and for those with complex topologies. For these unconstrained pruned models, we then show that UPSCALE outperforms a baseline export in inference-time latency. |
| Researcher Affiliation | Industry | Apple, Cupertino, USA. Correspondence to: Alvin Wan <alvinwan@apple.com>, Qi Shan <qshan@apple.com>. |
| Pseudocode | Yes | Algorithm 1 UPSCALE |
| Open Source Code | Yes | https://github.com/apple/ml-upscale |
| Open Datasets | Yes | All accuracies are reported on the ImageNet ILSVRC 2015 (Russakovsky et al., 2015) validation dataset. |
| Dataset Splits | Yes | All accuracies are reported on the ImageNet ILSVRC 2015 (Russakovsky et al., 2015) validation dataset. |
| Hardware Specification | Yes | We use a single V100 GPU with 32 GB RAM. |
| Software Dependencies | No | To export models for timing, we run an existing pruning strategy on the provided model, export using UPSCALE, then use PyTorch’s jit trace to produce a Python-less executable. This traced model is then benchmarked using PyTorch’s built-in profiling utility, including CUDA activities and tracking tensor memory allocation. The paper mentions PyTorch and CUDA but does not provide specific version numbers. (A sketch of this tracing-and-profiling pipeline appears below the table.) |
| Experiment Setup | Yes | We sparsify parameters at intervals of 2.5% from 0% to 100% and test 5 pruning strategies across 15+ architectures. All our latency measurements are the aggregate of 100 runs, with both mean and standard deviations reported. We channel prune DenseNet121 at 10%, 20%, 30%, 40%, 50% parameter sparsity using the LAMP heuristic... We then fine-tune all 10 models for 5 epochs each. (A sketch of the 100-run timing loop follows the table.) |
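
The Software Dependencies row describes the export-and-benchmark pipeline: run a pruning strategy, export with UPSCALE, trace with PyTorch's jit, then profile with CUDA activities and memory tracking. The sketch below illustrates that pipeline under our assumptions: the pruning step and the UPSCALE export call are left as placeholder comments rather than real API calls, the DenseNet121 model and input shape are illustrative choices, and a recent torchvision is assumed for `weights=None`.

```python
# Minimal sketch of the export-and-profiling pipeline described in the
# Software Dependencies row. The pruning and UPSCALE export steps are
# placeholders; see https://github.com/apple/ml-upscale for the actual API.
import torch
import torchvision
from torch.profiler import profile, ProfilerActivity

model = torchvision.models.densenet121(weights=None).eval().cuda()
example = torch.randn(1, 3, 224, 224, device="cuda")

# 1. Run an existing channel-pruning strategy on the model (placeholder).
# 2. Export the unconstrained-pruned model with UPSCALE (placeholder;
#    use the repository's export entry point here).

# 3. Trace to a Python-less executable.
traced = torch.jit.trace(model, example)

# 4. Benchmark with PyTorch's built-in profiler, including CUDA activities
#    and tensor memory allocation.
with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    profile_memory=True,
) as prof:
    with torch.no_grad():
        traced(example)

print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```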
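
The Experiment Setup row reports every latency number as the aggregate of 100 runs with mean and standard deviation. Below is a minimal timing loop consistent with that description; the warm-up iteration count and per-run CUDA synchronization are assumptions of this sketch, not details taken from the paper.

```python
import time
import statistics
import torch

def benchmark_latency(traced_model, example, runs=100, warmup=10):
    """Return (mean_ms, std_ms) over `runs` forward passes.

    The 100-run aggregate matches the paper; the warm-up iterations and
    per-run CUDA synchronization are assumptions of this sketch.
    """
    with torch.no_grad():
        for _ in range(warmup):           # warm up kernels and allocator
            traced_model(example)
        torch.cuda.synchronize()

        timings_ms = []
        for _ in range(runs):
            start = time.perf_counter()
            traced_model(example)
            torch.cuda.synchronize()      # wait for GPU work to finish
            timings_ms.append((time.perf_counter() - start) * 1e3)

    return statistics.mean(timings_ms), statistics.stdev(timings_ms)

# Example usage with the traced model from the previous sketch:
# mean_ms, std_ms = benchmark_latency(traced, example)
```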