Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression
Authors: Baeseong Park, Se Jung Kwon, Daehwan Oh, Byeongwook Kim, Dongsoo Lee
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our proposed compression scheme achieves almost the maximum compression ratio for the Transformer and ResNet-50 pruned by various fine-grained pruning methods. [...] In this section, we demonstrate the encoding capability of our proposed sequential encoding techniques using synthetic random data and NNs pruned by various pruning methods. |
| Researcher Affiliation | Industry | 1NAVER CLOVA, {baesung.park,sejung.kwon,byeonguk.kim,dongsoo.lee}@navercorp.com 2Samsung Research, dhdh.oh@samsung.com |
| Pseudocode | Yes | Algorithm 1: SpMV (CSR format) [...] Algorithm 2: Proposed SpMV (using encoded weights) [...] Algorithm 3: Encoding algorithm when Ns = 2. |
| Open Source Code | No | The paper mentions a link (https://github.com/google-research/google-research/tree/master/state_of_sparsity) which is for models used in their experiments, not for the source code of their proposed encoding methodology. |
| Open Datasets | Yes | We measure compression capability of our proposed sequential encoding scheme using sparse Transformer (Vaswani et al., 2017) on WMT 14 en-de dataset and ResNet-50 (He et al., 2016) on ImageNet. |
| Dataset Splits | No | The paper uses the WMT 14 en-de dataset and ImageNet but does not explicitly provide information regarding specific train, validation, or test splits for these datasets. |
| Hardware Specification | Yes | MKL library (operated by i7-7700 @ 3.6GHz) and CUDA 10.2 library (performed by NVIDIA V100) perform sparse matrix multiplications whose execution times are normalized with respect to corresponding dense matrix multiplications (i.e., using a dense (2048 × 2048) matrix). |
| Software Dependencies | Yes | MKL library (operated by i7-7700 @ 3.6GHz) and CUDA 10.2 library (performed by NVIDIA V100) perform sparse matrix multiplications whose execution times are normalized with respect to corresponding dense matrix multiplications (i.e., using a dense (2048 × 2048) matrix). |
| Experiment Setup | Yes | For our experiments, Nin is selected to be 8 such that we feed a decoder on a byte-level. [...] Specifically, for a given set of Nin and Nout, an element of M ∈ R^(Nout × ((Ns+1)·Nin)) is randomly assigned to 0 or 1 with equal probability. [...] For the ResNet-50 model (on ImageNet), we also consider signed INT8 format. |
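The Pseudocode row above quotes "Algorithm 1: SpMV (CSR format)" as the paper's baseline. As background only, here is a minimal sketch of sparse matrix-vector multiplication over the standard CSR layout; it does not reproduce the paper's proposed encoded-weight algorithms (Algorithms 2 and 3), and the function name `csr_spmv` is our own.

```python
# Minimal CSR sparse matrix-vector multiply (SpMV), the baseline format
# referenced by Algorithm 1 of the paper. Illustrative sketch only.

def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x, where A is stored in CSR form.

    values  : nonzero entries of A, in row-major order
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits row i's nonzeros
    x       : dense input vector
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # Accumulate only the stored nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# Example sparse 3x3 matrix:
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0]]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 2, 0, 1]
row_ptr = [0, 2, 3, 5]
y = csr_spmv(values, col_idx, row_ptr, [1.0, 1.0, 1.0])
```

CSR stores per-nonzero column indices, which is the irregular-sparsity metadata overhead that the paper's fixed-to-fixed encoding scheme is designed to avoid.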