Learning Best Combination for Efficient N:M Sparsity
Authors: Yuxin Zhang, Mingbao Lin, ZhiHang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments demonstrate that our learning best combination (LBC) performs consistently better than off-the-shelf N:M sparsity methods across various networks. |
| Researcher Affiliation | Collaboration | (1) MAC Lab, School of Informatics, Xiamen University, Xiamen, China; (2) Tencent Youtu Lab, Shanghai, China; (3) Institute of Artificial Intelligence, Xiamen University, Xiamen, China; (4) Pengcheng Lab, Shenzhen, China |
| Pseudocode | Yes | Algorithm 1: Learning Best Combination (LBC). |
| Open Source Code | Yes | Our project is released at https://github.com/zyxxmu/LBC. |
| Open Datasets | Yes | ImageNet [4]; COCO benchmark [22]. |
| Dataset Splits | No | The paper mentions using the ImageNet and COCO datasets but does not explicitly provide the specific training, validation, and test split percentages or sample counts. |
| Hardware Specification | Yes | We implement LBC using the PyTorch [31] upon 2 NVIDIA Tesla A100 GPUs. |
| Software Dependencies | No | The paper mentions software such as PyTorch and the Timm framework but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Following Zhou et al. [40], we train the ResNets for 120 epochs with an initial learning rate of 0, which is linearly increased to 0.1 during the first 5 epochs and then decayed to 0 scheduled by the cosine annealing. The SGD is adopted to update parameters with the weight decay and momentum set to 0.0005 and 0.9. Besides, we adopt the Timm framework [38] to train DeiT with 300 epochs. To implement LBC, we set t_i to 0 and t_f to 1/2 of the total training epochs. (Sketches of the N:M pattern and this schedule follow the table.) |
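For context on the sparsity pattern the paper targets: the sketch below builds a generic magnitude-based N:M mask, keeping the N largest-magnitude weights in every group of M consecutive weights along the input dimension. This illustrates the N:M constraint only; `nm_mask` is a hypothetical helper, and the paper's LBC method *learns* which combination to keep rather than selecting by magnitude.

```python
import torch

def nm_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Binary N:M mask: in every group of M consecutive weights along the
    input dimension, keep the N entries with the largest magnitude.
    (Generic baseline pattern, not the authors' learned LBC selection.)"""
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dim must be divisible by M"
    groups = weight.abs().reshape(out_features, in_features // m, m)
    # Indices of the N largest magnitudes within each group of M.
    topk = groups.topk(n, dim=-1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, topk, 1.0)
    return mask.reshape(out_features, in_features)

# Example: a 2:4 mask keeps 2 of every 4 consecutive weights (50% sparsity).
w = torch.randn(8, 16)
sparse_w = w * nm_mask(w, n=2, m=4)
```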
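The Experiment Setup row describes a 5-epoch linear warmup from 0 to 0.1 followed by cosine annealing to 0 over 120 epochs, with SGD using momentum 0.9 and weight decay 0.0005. Below is a minimal PyTorch sketch of that schedule, assuming per-epoch scheduler stepping; the model and the training loop body are placeholders, not part of the paper's code.

```python
import math
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(16, 10)  # placeholder for the sparse ResNet
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

total_epochs, warmup_epochs = 120, 5

def lr_lambda(epoch: int) -> float:
    if epoch < warmup_epochs:
        return epoch / warmup_epochs  # linear warmup from 0 to the base lr
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay to 0

scheduler = LambdaLR(optimizer, lr_lambda)
for epoch in range(total_epochs):
    # ... one training epoch of forward/backward/SGD steps would go here ...
    scheduler.step()
```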