Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Best Combination for Efficient N:M Sparsity

Authors: Yuxin Zhang, Mingbao Lin, ZhiHang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji

NeurIPS 2022 | Venue PDF | LLM Run Details
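For context on the technique named in the title: N:M sparsity keeps at most N nonzero weights in every contiguous block of M weights (e.g. the 2:4 pattern accelerated on NVIDIA Ampere GPUs). A minimal magnitude-based sketch in NumPy follows; note the paper's LBC method instead *learns* which combination to keep, so this greedy top-N version is only the baseline idea, and the function name is our own:

```python
import numpy as np

def nm_prune(weights, n=2, m=4):
    """Zero all but the n largest-magnitude weights in each block of m.

    Generic magnitude-based N:M pruning (a simplified baseline, not the
    paper's learned-combination approach). Assumes weights.size % m == 0.
    """
    w = weights.reshape(-1, m).copy()
    # Indices of the (m - n) smallest-magnitude entries in each block.
    drop = np.argsort(np.abs(w), axis=1)[:, : m - n]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weights.shape)

w = np.array([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.03, 0.6])
print(nm_prune(w, n=2, m=4))  # exactly two nonzeros survive per group of four
```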

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments demonstrate that our learning best combination (LBC) performs consistently better than off-the-shelf N:M sparsity methods across various networks.
Researcher Affiliation | Collaboration | 1. MAC Lab, School of Informatics, Xiamen University, Xiamen, China; 2. Tencent Youtu Lab, Shanghai, China; 3. Institute of Artificial Intelligence, Xiamen University, Xiamen, China; 4. Pengcheng Lab, Shenzhen, China
Pseudocode | Yes | Algorithm 1: Learning Best Candidate (LBC).
Open Source Code | Yes | Our project is released at https://github.com/zyxxmu/LBC.
Open Datasets | Yes | ImageNet [4] and the COCO benchmark [22].
Dataset Splits | No | The paper mentions using the ImageNet and COCO datasets but does not explicitly provide the training/validation/test split percentages or sample counts.
Hardware Specification | Yes | We implement LBC using PyTorch [31] upon 2 NVIDIA Tesla A100 GPUs.
Software Dependencies | No | The paper mentions software such as PyTorch and the Timm framework but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | Following Zhou et al. [40], we train the ResNets for 120 epochs with an initial learning rate of 0, which is linearly increased to 0.1 during the first 5 epochs and then decayed to 0 by cosine annealing. SGD is adopted to update parameters, with weight decay and momentum set to 0.0005 and 0.9. Besides, we adopt the Timm framework [38] to train DeiT for 300 epochs. To implement LBC, we set t_i to 0 and t_f to 1/2 of the total training epochs.
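The learning-rate recipe quoted in the Experiment Setup row (linear warmup from 0 to 0.1 over the first 5 epochs, then cosine decay to 0 over 120 epochs) can be sketched as below; the function name and per-epoch granularity are our assumptions, not the authors' code:

```python
import math

def lr_at_epoch(epoch, total_epochs=120, warmup_epochs=5, peak_lr=0.1):
    """Learning rate per the stated schedule: linear warmup 0 -> peak_lr
    over warmup_epochs, then cosine annealing back to 0 (a sketch of the
    recipe described in the paper, not the authors' implementation)."""
    if epoch < warmup_epochs:
        return peak_lr * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

print(lr_at_epoch(0))    # 0.0 (cold start)
print(lr_at_epoch(5))    # 0.1 (peak after warmup)
print(lr_at_epoch(120))  # 0.0 (fully annealed)
```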