DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

Authors: Wei Sun, Aojun Zhou, Sander Stuijk, Rob Wijnhoven, Andrew Oakleigh Nelson, Hongsheng Li, Henk Corporaal

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our solution on the large-scale ImageNet dataset with both heavy ResNets [19] and compact RegNets [20], and the smaller-scale CIFAR100 dataset with ResNet56. Results of ResNet50 on ImageNet show that layer-wise N:M sparsity searched by our framework achieves SOTA accuracy-compression trade-off under fine-grained N:M sparsity.
Researcher Affiliation | Collaboration | Wei Sun, Aojun Zhou, Sander Stuijk, Andrew Nelson, Rob Wijnhoven, Hongsheng Li, Henk Corporaal; Eindhoven University of Technology; ViNotion B.V., the Netherlands; CUHK-SenseTime Joint Lab, CUHK
Pseudocode | Yes | Algorithm 3.1 DominoSearch
Open Source Code | Yes | Our code and models are publicly available at https://github.com/NM-sparsity/DominoSearch.
Open Datasets | Yes | We demonstrate the advantages of layer-wise fine-grained N:M sparsity via various networks on the large-scale ImageNet dataset [30], including heavy ResNets [19] and compact RegNets [20]. We also demonstrate the effectiveness of our method on the smaller CIFAR100 dataset with ResNet56 [19].
Dataset Splits | No | The paper mentions 'Training settings' and that 'Detailed settings can be found in the Supplementary Material.' Without access to the supplementary material, the main text provides no explicit dataset split information, such as percentages or sample counts for training, validation, and testing.
Hardware Specification | No | The paper mentions 'Nvidia Ampere Sparse Tensor Core' as a general hardware innovation but does not provide specific details of the hardware (e.g., exact GPU models, CPU models, or memory specifications) used to run its experiments.
Software Dependencies | No | The paper mentions 'Pytorch pre-trained model zoo' but does not specify any software dependencies with version numbers (e.g., Python version, PyTorch version, CUDA version) needed for replication.
Experiment Setup | Yes | We set β1 = 0.5, β2 = 0.5, and use β1 = 0.8, β2 = 0.2 for searching under a FLOPs constraint. Furthermore, Algorithm 3.1 includes specific parameters such as 'Learning rate ξ. Interval Kc. Voting ratio vr.'
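
For context on the scheme assessed in the table above: fine-grained N:M sparsity keeps at most N non-zero weights in every group of M consecutive weights (e.g., the 2:4 pattern accelerated by NVIDIA Ampere Sparse Tensor Cores). The PyTorch sketch below illustrates magnitude-based N:M masking for a single linear layer. It is an illustrative assumption, not the authors' DominoSearch code, which additionally searches the N:M scheme layer by layer.

```python
# Minimal sketch of fine-grained N:M magnitude pruning (illustrative, not the
# authors' implementation): within every group of M consecutive weights along
# the input dimension, keep the N largest-magnitude entries and zero the rest.
import torch

def nm_sparse_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Return a binary mask enforcing N:M sparsity on a 2-D weight matrix."""
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dimension must be divisible by M"
    groups = weight.abs().reshape(out_features, in_features // m, m)
    # indices of the N largest-magnitude weights in each group of M
    topk = groups.topk(n, dim=-1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, topk, 1.0)
    return mask.reshape(out_features, in_features)

# Example: apply a 2:4 mask to a random dense layer
w = torch.randn(64, 128)
w_sparse = w * nm_sparse_mask(w, n=2, m=4)
assert (w_sparse.reshape(64, -1, 4) != 0).sum(-1).max() <= 2
```

In DominoSearch the pair (N, M) is chosen per layer under a global complexity budget, so different layers may end up with different masks; the fixed 2:4 scheme above is only the simplest instance of the pattern.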
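
As a reading aid, the search hyperparameters quoted in the Experiment Setup row can be collected into a configuration sketch. All field names below are hypothetical placeholders rather than identifiers from the authors' repository, and the values of ξ, Kc, and vr are not stated in the excerpt, so they are left as required arguments instead of being guessed.

```python
from dataclasses import dataclass

@dataclass
class SearchConfig:
    """Hypothetical container for the DominoSearch hyperparameters quoted above;
    field names are placeholders, not identifiers from the authors' code."""
    lr_xi: float               # learning rate ξ (value not stated in the excerpt)
    interval_kc: int           # interval Kc (value not stated in the excerpt)
    voting_ratio_vr: float     # voting ratio vr (value not stated in the excerpt)
    beta1: float = 0.5         # 0.8 when searching under a FLOPs constraint
    beta2: float = 0.5         # 0.2 when searching under a FLOPs constraint
```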