DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks
Authors: Wei Sun, Aojun Zhou, Sander Stuijk, Rob Wijnhoven, Andrew Oakleigh Nelson, Hongsheng Li, Henk Corporaal
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our solution on the large scale ImageNet dataset with both heavy ResNets [19] and compact RegNets [20], and the smaller scale CIFAR100 dataset with ResNet56. Results of ResNet50 on ImageNet show that layer-wise N:M sparsity searched by our framework achieves SOTA accuracy-compression trade-off under fine-grained N:M sparsity. |
| Researcher Affiliation | Collaboration | Wei Sun, Aojun Zhou, Sander Stuijk, Andrew Nelson, Rob Wijnhoven, Hongsheng Li, Henk Corporaal. Affiliations: Eindhoven University of Technology; ViNotion B.V., the Netherlands; CUHK-Sensetime Joint Lab, CUHK |
| Pseudocode | Yes | Algorithm 3.1 DominoSearch |
| Open Source Code | Yes | Our code and models are publicly available at https://github.com/NM-sparsity/DominoSearch. |
| Open Datasets | Yes | We demonstrate the advantages of layer-wise fine-grained N:M sparsity via various networks on the large-scale ImageNet dataset [30] including heavy ResNets [19] and compact RegNets [20]. We also demonstrate the effectiveness of our method on the smaller CIFAR100 dataset with ResNet56 [19]. |
| Dataset Splits | No | The paper mentions 'Training settings' and that 'Detailed settings can be found in the Supplementary Material.' Without access to the supplementary material, explicit dataset split information such as percentages or sample counts for training, validation, and testing is not provided in the main text. |
| Hardware Specification | No | The paper mentions 'Nvidia Ampere Sparse Tensor Core' as a general hardware innovation but does not provide specific details of the hardware (e.g., exact GPU models, CPU models, or memory specifications) used to run its experiments. |
| Software Dependencies | No | The paper mentions 'Pytorch pre-trained model zoo' but does not specify any software dependencies with version numbers (e.g., Python version, PyTorch version, CUDA version) needed for replication. |
| Experiment Setup | Yes | We set β1 = 0.5, β2 = 0.5 and use β1 = 0.8, β2 = 0.2 for searching under the FLOPs constraint. Furthermore, Algorithm 3.1 includes specific parameters such as 'Learning rate ξ. Interval Kc. Voting ratio vr.' (see the configuration sketch below the table). |
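
The hyper-parameters quoted in the Experiment Setup row can be collected into a single configuration object for replication. The sketch below is a minimal, hypothetical illustration only: the class name, field names, and the example instantiation are assumptions and not code from the authors' repository. Only the β1/β2 values come from the quoted text; the values of ξ, Kc, and vr are not given in the main text, so they are left for the user to supply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DominoSearchConfig:
    """Hypothetical container for the search hyper-parameters quoted above.

    beta1/beta2 defaults follow the paper text (0.5/0.5; 0.8/0.2 under the
    FLOPs constraint). The remaining fields appear in Algorithm 3.1 but their
    values are not reported in the main text.
    """
    beta1: float = 0.5                        # objective weight β1 (paper default)
    beta2: float = 0.5                        # objective weight β2 (paper default)
    learning_rate_xi: Optional[float] = None  # ξ in Algorithm 3.1 (not given in main text)
    interval_kc: Optional[int] = None         # Kc in Algorithm 3.1 (not given in main text)
    voting_ratio_vr: Optional[float] = None   # vr in Algorithm 3.1 (not given in main text)


# Paper-quoted weighting when searching under the FLOPs constraint:
flops_constraint_cfg = DominoSearchConfig(beta1=0.8, beta2=0.2)
```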