Deformable Butterfly: A Highly Structured and Sparse Linear Transform

Authors: Rui Lin, Jie Ran, King Hung Chiu, Graziano Chesi, Ngai Wong

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the proposed DeBut chains on LeNet (Table 1, left), VGG-16-BN (Table 1, right) and ResNet50 [11] using the standard MNIST [21], CIFAR-10 [17] and ImageNet [8] datasets, respectively. The results are presented in the subsections below.
Researcher Affiliation | Collaboration | 1 Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong. Email Address: {linrui, jieran, chesi, nwong}@eee.hku.hk 2 United Microelectronics Centre (Hong Kong) Limited, Hong Kong Science Park, N.T., Hong Kong. Email Address: khchiu@umechk.com
Pseudocode | No | The paper describes its algorithms and processes using prose and mathematical equations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | The codes and Appendix are publicly available at: https://github.com/ruilin0212/DeBut.
Open Datasets | Yes | We test the proposed DeBut chains on LeNet (Table 1, left), VGG-16-BN (Table 1, right) and ResNet50 [11] using the standard MNIST [21], CIFAR-10 [17] and ImageNet [8] datasets, respectively.
Dataset Splits | Yes | We test the proposed DeBut chains on LeNet (Table 1, left), VGG-16-BN (Table 1, right) and ResNet50 [11] using the standard MNIST [21], CIFAR-10 [17] and ImageNet [8] datasets, respectively.
Hardware Specification | Yes | All coding is done with PyTorch, and experiments run on an NVIDIA GeForce GTX 1080 Ti graphics card with an 11GB frame buffer.
Software Dependencies | No | The paper mentions "PyTorch" as the coding framework but does not specify a version number for it or any other software dependency.
Experiment Setup | Yes | On MNIST and CIFAR-10 datasets, the learning rate is 0.01 with a decaying step of 50, the batch size and the number of epochs are set to 64 and 150, respectively. As for ImageNet training, the decaying step is 30 and the training warms up in the first 5 epochs. The batch size and the number of epochs are 128 and 100, respectively. We use the standard stochastic gradient descent (SGD) for fine-tuning.
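For concreteness, below is a minimal PyTorch sketch of the quoted MNIST/CIFAR-10 fine-tuning setup. Only the learning rate (0.01), decay step (50 epochs), batch size (64), epoch count (150), and the use of standard SGD are reported by the paper; the model, dataset object, momentum/weight decay, and learning-rate decay factor are assumptions, and the DeBut-specific layer substitution is not shown.

```python
# Minimal sketch of the quoted MNIST/CIFAR-10 fine-tuning setup.
# Reported values: standard SGD, lr=0.01, decay step 50, batch size 64, 150 epochs.
# The model, dataset, and the decay factor (StepLR's default gamma) are assumptions.
import torch
from torch import nn
from torch.utils.data import DataLoader

def finetune(model: nn.Module, train_set, device: str = "cuda") -> None:
    loader = DataLoader(train_set, batch_size=64, shuffle=True)           # batch size 64
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)              # standard SGD, lr 0.01
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50)  # decay every 50 epochs
    criterion = nn.CrossEntropyLoss()

    model.to(device).train()
    for epoch in range(150):                                              # 150 epochs
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()                                                  # per-epoch LR decay
```

The quoted ImageNet settings (decay step 30, 5 warm-up epochs, batch size 128, 100 epochs) would slot into the same loop with different constants plus a warm-up schedule, which the paper does not specify in detail.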