Dynamic Sparse Graph for Efficient Deep Learning
Authors: Liu Liu, Lei Deng, Xing Hu, Maohua Zhu, Guoqi Li, Yufei Ding, Yuan Xie
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show significant memory saving (1.7-4.5x) and operation reduction (2.3-4.4x) with little accuracy loss on various benchmarks. |
| Researcher Affiliation | Academia | (1) Department of Electrical and Computer Engineering, University of California, Santa Barbara; (2) Department of Computer Science, University of California, Santa Barbara; (3) Center for Brain Inspired Computing Research, Department of Precision Instrument, Tsinghua University |
| Pseudocode | Yes | Algorithm 1: DSG training |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code for the described methodology. |
| Open Datasets | Yes | Regarding the evaluation network models, we use LeNet (LeCun et al., 1998) and a multi-layered perceptron (MLP) on small-scale FASHION dataset (Xiao et al., 2017), VGG8... on medium-scale CIFAR10 dataset (Krizhevsky & Hinton, 2009), VGG8/WRN8-2 on another medium-scale CIFAR100 dataset (Krizhevsky & Hinton, 2009), and AlexNet... on large-scale ImageNet dataset (Deng et al., 2009) as workloads. |
| Dataset Splits | No | The paper mentions 'validation set' and 'validation accuracy' but does not explicitly provide the specific dataset split percentages or methodologies for creating these splits. |
| Hardware Specification | Yes | The programming framework is PyTorch and the training platform is based on NVIDIA Titan Xp GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' as the programming framework and 'MKL compute library' but does not specify any version numbers for these software dependencies. |
| Experiment Setup | Yes | The projection matrices are fixed after a random initialization at the beginning of training. We just update the projected weights in the low-dimensional space every 50 iterations to reduce the projection overhead. |
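The setup described in the last row — a random projection matrix fixed at initialization, with the projected weights refreshed only every 50 iterations — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the function names, the output dimension of 32, and the surrounding loop structure are all assumptions for illustration.

```python
import numpy as np

def make_projection(d_in, d_out, seed=0):
    """Hypothetical helper: a random projection matrix drawn once at the
    start of training and then kept fixed (never updated by gradients)."""
    rng = np.random.default_rng(seed)
    # Scaled Gaussian entries approximately preserve inner products
    # (Johnson-Lindenstrauss-style projection).
    return rng.standard_normal((d_in, d_out)) / np.sqrt(d_out)

def train_loop(weights, n_iters, refresh_every=50):
    """Illustrative training loop: the low-dimensional copy of the weights
    is recomputed only every `refresh_every` iterations, amortizing the
    projection overhead as the paper describes."""
    proj = make_projection(weights.shape[1], 32)
    projected = weights @ proj
    for it in range(n_iters):
        if it > 0 and it % refresh_every == 0:
            # Refresh the projected weights in the low-dimensional space.
            projected = weights @ proj
        # ... the projected weights would be used here for the sparse
        # graph construction while `weights` are trained as usual ...
    return projected

# Usage: project an 8x64 weight matrix down to 8x32.
W = np.random.default_rng(1).standard_normal((8, 64))
low_dim = train_loop(W, n_iters=120)
```

The key design point mirrored here is that only the projected copy is refreshed periodically; the projection matrix itself costs nothing to maintain because it is never trained.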