SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization

Authors: Chuanyang Zheng, Zheyang Li, Kai Zhang, Zhi Yang, Wenming Tan, Jun Xiao, Ye Ren, Shiliang Pu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness of our method. Notably, the proposed approach outperforms the existing state-of-the-art approaches on ImageNet, increasing accuracy by 0.7% over the DeiT-Base baseline while saving 50% FLOPs. On COCO, we are the first to show that 70% FLOPs of Faster R-CNN with ViT backbone can be removed with only 0.3% mAP drop.
Researcher Affiliation | Collaboration | Chuanyang Zheng 1, Zheyang Li 1,2, Kai Zhang 1, Zhi Yang 1, Wenming Tan 1, Jun Xiao 2, Ye Ren 1, Shiliang Pu 1 (1 Hikvision Research Institute, Hangzhou, China; 2 Zhejiang University, Hangzhou, China)
Pseudocode | Yes | Algorithm 1: Collaborative Pruning with EA. Input: pre-trained model T_o, FLOPs constraint C_budget, dataset D, search iterations E, population size Q, number of components M, fitness value f. Output: optimal pruned model. (An illustrative sketch of such a search loop is given after this table.)
Open Source Code | Yes | The code is available at https://github.com/hikvision-research/SAViT.
Open Datasets | Yes | The pruning process is performed on the pre-trained DeiT released from the official implementation on ImageNet-1k [41]. [...] We employ pruning on the popular object detection framework Faster R-CNN [48] with Swin-Tiny backbone on the COCO 2017 dataset [49] and report mean Average Precision (mAP) for comparison.
Dataset Splits | No | The paper uses standard datasets (ImageNet-1k and COCO 2017) with predefined splits and mentions fine-tuning, but it does not report the training/validation/test split percentages or sample counts used in this work, nor does it explicitly state that its experimental setup follows a specific predefined split from a cited source.
Hardware Specification | Yes | Runtime speedup of compressed DeiT on Nvidia V100.
Software Dependencies | No | The paper mentions using the DeiT model released from the official implementation and the official pre-trained Swin, but it does not list specific software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, CUDA 11.x).
Experiment Setup | Yes | After pruning, we fine-tune the pruned network using the same setting as DeiT [23] without warm-up. [...] Finally, we fine-tune the pruned network for 300 epochs under the same strategies as Swin [3]. [...] For DeiT-Base, we fine-tune the pruned models for 80 epochs following the identical setting in Section 4.1. (A hypothetical summary of these schedules follows below.)
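
To make the quoted Algorithm 1 concrete, the following is a minimal, hypothetical sketch of an evolutionary search over per-component pruning ratios under a FLOPs budget. The function names (`estimate_flops`, `evaluate_fitness`), the mutation scheme, and all numeric defaults are assumptions for illustration, not the authors' implementation.

```python
import random

def evolutionary_prune_search(
    estimate_flops,     # callable: pruning ratios -> FLOPs of the pruned model
    evaluate_fitness,   # callable: pruning ratios -> fitness score (higher is better)
    flops_budget,       # C_budget: maximum allowed FLOPs
    num_components,     # M: number of prunable components (heads, MLP channels, ...)
    iterations=20,      # E: number of search iterations
    population=50,      # Q: population size
    mutate_std=0.05,    # std-dev of the Gaussian mutation applied to ratios
):
    """Minimal EA over per-component pruning ratios in [0, 0.9] (illustrative only)."""

    def random_candidate():
        # Rejection-sample ratio vectors until the FLOPs budget is satisfied.
        while True:
            ratios = [random.uniform(0.0, 0.9) for _ in range(num_components)]
            if estimate_flops(ratios) <= flops_budget:
                return ratios

    pool = [random_candidate() for _ in range(population)]
    for _ in range(iterations):
        # Keep the fitter half as parents, refill the pool with mutated children.
        parents = sorted(pool, key=evaluate_fitness, reverse=True)[: population // 2]
        children = []
        while len(children) < population - len(parents):
            base = random.choice(parents)
            child = [min(0.9, max(0.0, r + random.gauss(0.0, mutate_std))) for r in base]
            if estimate_flops(child) <= flops_budget:
                children.append(child)
        pool = parents + children
    # Return the best pruning configuration found.
    return max(pool, key=evaluate_fitness)
```

Here the ratio vector is just an abstract stand-in for whatever prunable components the model exposes; the paper's method additionally couples the components during pruning, which this sketch does not model.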
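
Similarly, the fine-tuning schedules quoted under Experiment Setup can be summarized as a small configuration sketch. Only the epoch counts and the absence of warm-up come from the quoted text; the dictionary keys and entry names are illustrative assumptions.

```python
# Hypothetical summary of the fine-tuning schedules quoted above.
FINETUNE_SCHEDULES = {
    # DeiT models: same recipe as DeiT [23], but without learning-rate warm-up
    # (epoch count follows the paper's Section 4.1 setting, not stated in the quote).
    "deit_pruned": {"recipe": "DeiT", "warmup_epochs": 0},
    # Swin-based model: 300 epochs under the same strategies as Swin [3].
    "swin_pruned": {"recipe": "Swin", "epochs": 300},
    # DeiT-Base: 80 epochs, otherwise identical to the Section 4.1 setting.
    "deit_base_pruned": {"recipe": "DeiT", "warmup_epochs": 0, "epochs": 80},
}
```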