Decomposable-Net: Scalable Low-Rank Compression for Neural Networks
Authors: Atsushi Yaguchi, Taiji Suzuki, Shuhei Nitta, Yukinobu Sakata, Akiyuki Tanizawa
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments on the ImageNet classification task, Decomposable-Net yields superior accuracy in a wide range of model sizes. We evaluate our methods on image-classification tasks of CIFAR-10/100 [Krizhevsky, 2009] and ImageNet [Deng et al., 2009] datasets using deep CNNs. |
| Researcher Affiliation | Collaboration | Atsushi Yaguchi¹, Taiji Suzuki²,³, Shuhei Nitta¹, Yukinobu Sakata¹ and Akiyuki Tanizawa¹; ¹Toshiba Corporation, Japan; ²The University of Tokyo, Japan; ³RIKEN Center for Advanced Intelligence Project, Japan |
| Pseudocode | Yes | The pseudo-code for learning Decomposable-Net is given in Algorithm 1. |
| Open Source Code | No | The paper provides links to third-party code used for comparison (e.g., TRP, VBMF) but does not state that the code for Decomposable-Net itself is open-source or provides a link to it. |
| Open Datasets | Yes | We evaluate our methods on image-classification tasks of CIFAR-10/100 [Krizhevsky, 2009] and ImageNet [Deng et al., 2009] datasets using deep CNNs. |
| Dataset Splits | Yes | All methods are evaluated in terms of the tradeoff between validation (top-1) accuracy and the number of multiply-accumulate operations (MACs). ... we follow the same baseline setup for the CIFAR datasets as used by [Zagoruyko and Komodakis, 2016], and the setup of [Yu et al., 2019] for the ImageNet dataset. |
| Hardware Specification | Yes | The wall-clock time for training Decomposable-Net with ResNet-50 on the ImageNet dataset was 6.9 days with eight NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) used for its experiments. |
| Experiment Setup | Yes | We experimentally tuned hyperparameters, and set α_u = 0.25, 0.5, and 0.8, respectively, for the CIFAR-10, CIFAR-100, and ImageNet datasets, while fixing α_l = 0.01 for all datasets. Except for the results shown in Figure 2(a), we set λ = 0.5 to balance the performance of the full- and low-rank models. |
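
Since neither Algorithm 1 nor the authors' implementation is publicly available (see the Pseudocode and Open Source Code rows above), the snippet below is only a minimal sketch of the generic low-rank step that methods like Decomposable-Net build on: factorizing a trained weight matrix with a truncated SVD so that one dense layer becomes two thinner ones with fewer MACs. The layer sizes, target rank, and NumPy implementation are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

# Hypothetical layer size: a fully-connected layer with 512 inputs and 256 outputs.
in_features, out_features = 512, 256
rank = 64  # illustrative target rank; low-rank methods typically tune this per layer

# Random stand-in for a trained weight matrix.
W = np.random.randn(out_features, in_features).astype(np.float32)

# Truncated SVD: W ≈ U_r diag(s_r) V_r^T, keeping the top-r singular values.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
U_r = U[:, :rank] * s[:rank]   # (out_features, rank), singular values folded in
Vt_r = Vt[:rank, :]            # (rank, in_features)

# The single layer y = W x becomes two thinner layers y = U_r (V_r^T x).
x = np.random.randn(in_features).astype(np.float32)
y_full = W @ x
y_lowrank = U_r @ (Vt_r @ x)
print("relative reconstruction error:",
      np.linalg.norm(y_full - y_lowrank) / np.linalg.norm(y_full))

# MAC counts before/after: the x-axis of the paper's accuracy-vs-MACs tradeoff.
macs_full = out_features * in_features
macs_lowrank = rank * in_features + out_features * rank
print(f"MACs: {macs_full} -> {macs_lowrank} ({macs_lowrank / macs_full:.2%} of full)")
```

Note that the λ quoted in the Experiment Setup row balances the full- and low-rank models during training, which is outside the scope of this post-hoc factorization sketch.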