Learning Best Combination for Efficient N:M Sparsity

Authors: Yuxin Zhang, Mingbao Lin, ZhiHang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments demonstrate that our learning best combination (LBC) performs consistently better than off-the-shelf N:M sparsity methods across various networks.
Researcher Affiliation | Collaboration | 1 MAC Lab, School of Informatics, Xiamen University, Xiamen, China; 2 Tencent Youtu Lab, Shanghai, China; 3 Institute of Artificial Intelligence, Xiamen University, Xiamen, China; 4 Pengcheng Lab, Shenzhen, China
Pseudocode | Yes | Algorithm 1: Learning Best Candidate (LBC).
Open Source Code | Yes | Our project is released at https://github.com/zyxxmu/LBC.
Open Datasets | Yes | ImageNet [4]; COCO benchmark [22].
Dataset Splits | No | The paper mentions using the ImageNet and COCO datasets but does not explicitly provide the specific training, validation, and test split percentages or sample counts.
Hardware Specification | Yes | We implement LBC using PyTorch [31] on 2 NVIDIA Tesla A100 GPUs.
Software Dependencies | No | The paper mentions software such as PyTorch and the Timm framework but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | Following Zhou et al. [40], we train the ResNets for 120 epochs with an initial learning rate of 0, which is linearly increased to 0.1 during the first 5 epochs and then decayed to 0 following a cosine annealing schedule. SGD is adopted to update parameters, with weight decay and momentum set to 0.0005 and 0.9, respectively. Besides, we adopt the Timm framework [38] to train DeiT for 300 epochs. To implement LBC, we set t_i to 0 and t_f to 1/2 of the total training epochs.
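
The ResNet schedule quoted above (SGD with momentum 0.9 and weight decay 0.0005, a 5-epoch linear warmup from 0 to 0.1, then cosine decay back to 0 over 120 epochs) can be written as a short PyTorch sketch. This is a minimal illustration under those assumptions only; the model is a placeholder and the LBC sparsity logic itself is not reproduced here, so it is not the authors' released training code.

    import math
    import torch

    # Placeholder model -- stands in for the sparse ResNet used in the paper.
    model = torch.nn.Linear(2048, 1000)

    epochs, warmup_epochs, peak_lr = 120, 5, 0.1
    optimizer = torch.optim.SGD(model.parameters(), lr=0.0,
                                momentum=0.9, weight_decay=0.0005)

    def lr_at(epoch):
        # Linear warmup from 0 to 0.1 over the first 5 epochs,
        # then cosine annealing from 0.1 back down to 0.
        if epoch < warmup_epochs:
            return peak_lr * epoch / warmup_epochs
        progress = (epoch - warmup_epochs) / (epochs - warmup_epochs)
        return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

    for epoch in range(epochs):
        for group in optimizer.param_groups:
            group["lr"] = lr_at(epoch)
        # ... one training epoch (forward, loss, backward, optimizer.step()) ...

At the start of each epoch the loop writes the scheduled rate into the optimizer's parameter groups, which matches the warmup-then-cosine behaviour described in the quoted setup.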