EPSD: Early Pruning with Self-Distillation for Efficient Model Compression
Authors: Dong Chen, Ning Liu, Yichen Zhu, Zhengping Che, Rui Ma, Fachao Zhang, Xiaofeng Mou, Yi Chang, Jian Tang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation covered diverse benchmarks (CIFAR-10/100, Tiny-ImageNet, full ImageNet, CUB-200-2011, and Pascal VOC), with EPSD outperforming advanced pruning and SD techniques. |
| Researcher Affiliation | Collaboration | Dong Chen1,2*, Ning Liu2*, Yichen Zhu2, Zhengping Che2, Rui Ma1, Fachao Zhang2, Xiaofeng Mou2, Yi Chang1, Jian Tang2. 1School of Artificial Intelligence, Jilin University; 2Midea Group |
| Pseudocode | No | The paper describes its method in steps and uses mathematical equations, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code for the methodology is openly available. |
| Open Datasets | Yes | We evaluate EPSD on various benchmarks, including CIFAR-10/CIFAR-100 (Krizhevsky, Hinton et al. 2009), Tiny-ImageNet, and full ImageNet (Deng et al. 2009) using diverse networks and comparing with the Simple Combination approach, advanced pruning and SD methods. |
| Dataset Splits | No | The paper refers to training and testing but does not explicitly provide specific percentages, sample counts, or a detailed methodology for train/validation/test dataset splits needed for reproduction. |
| Hardware Specification | No | The paper discusses training efforts and wall time but does not specify the exact hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or version numbers (e.g., PyTorch 1.x, Python 3.x). |
| Experiment Setup | Yes | We incorporate three distinct SD algorithms (CS-KD (Yun et al. 2020), PS-KD (Kim et al. 2021), and DLB (Shen et al. 2022)) into EPSD to ensure a comprehensive evaluation. Our experiments are conducted on CIFAR-10/100 and Tiny-ImageNet datasets across five sparsity ratios (36%, 59%, 79%, 90%, 95%). To ensure fairness in comparison, we employ identical hyper-parameters for training each dataset. |
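
Since the paper provides no pseudocode or released code, the sketch below is only a minimal, hypothetical illustration of the experiment grid described in the setup row (three SD methods swept across the five reported sparsity ratios). The `global_magnitude_masks` helper uses a generic global magnitude-pruning criterion for illustration and is not the authors' EPSD pruning criterion; the model and the SD training loop are placeholders.

```python
# Hypothetical sketch (not the authors' released code): sweeping the sparsity
# ratios and SD methods named in the experiment-setup row.
import torch
import torch.nn as nn

SPARSITY_RATIOS = [0.36, 0.59, 0.79, 0.90, 0.95]   # from the experiment setup row
SD_METHODS = ["CS-KD", "PS-KD", "DLB"]              # SD algorithms named in the paper


def global_magnitude_masks(model: nn.Module, sparsity: float):
    """Return per-layer binary masks zeroing the globally smallest |w|.

    Generic magnitude criterion used only for illustration; EPSD's actual
    early-pruning criterion may differ.
    """
    weights = [p for _, p in model.named_parameters() if p.dim() > 1]
    scores = torch.cat([w.detach().abs().flatten() for w in weights])
    k = max(int(sparsity * scores.numel()), 1)
    threshold = torch.kthvalue(scores, k).values
    return [(w.detach().abs() > threshold).float() for w in weights]


if __name__ == "__main__":
    # Placeholder model; the paper uses diverse networks on CIFAR/Tiny-ImageNet.
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
    for sparsity in SPARSITY_RATIOS:
        masks = global_magnitude_masks(model, sparsity)
        kept = sum(m.sum().item() for m in masks)
        total = sum(m.numel() for m in masks)
        for sd in SD_METHODS:
            # Placeholder: a real run would train with the chosen SD loss under
            # identical hyper-parameters per dataset, as the paper states.
            print(f"{sd} @ sparsity {sparsity:.0%}: kept {kept:.0f}/{total} weights")
```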