Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression

Authors: Bae Seong Park, Se Jung Kwon, Daehwan Oh, Byeongwook Kim, Dongsoo Lee

ICLR 2022

Each entry below gives the reproducibility variable, the result, and the LLM response quoting or assessing the paper:
Research Type: Experimental. "We demonstrate that our proposed compression scheme achieves almost the maximum compression ratio for the Transformer and ResNet-50 pruned by various fine-grained pruning methods. [...] In this section, we demonstrate the encoding capability of our proposed sequential encoding techniques using synthetic random data and NNs pruned by various pruning methods."
Researcher Affiliation: Industry. "¹NAVER CLOVA, {baesung.park,sejung.kwon,byeonguk.kim,dongsoo.lee}@navercorp.com; ²Samsung Research, dhdh.oh@samsung.com"
Pseudocode: Yes. "Algorithm 1: SpMV (CSR format) [...] Algorithm 2: Proposed SpMV (using encoded weights) [...] Algorithm 3: Encoding algorithm when Ns = 2." (A sketch of the standard CSR SpMV kernel that Algorithm 1 refers to appears below this list.)
Open Source Code: No. The paper links to https://github.com/google-research/google-research/tree/master/state_of_sparsity, which hosts the models used in their experiments, not the source code of the proposed encoding methodology.
Open Datasets: Yes. "We measure compression capability of our proposed sequential encoding scheme using sparse Transformer (Vaswani et al., 2017) on the WMT'14 en-de dataset and ResNet-50 (He et al., 2016) on ImageNet."
Dataset Splits: No. The paper uses the WMT'14 en-de dataset and ImageNet but does not explicitly specify the train, validation, or test splits for these datasets.
Hardware Specification: Yes. "MKL library (operated by i7-7700 @ 3.6GHz) and CUDA 10.2 library (performed by NVIDIA V100) perform sparse matrix multiplications whose execution times are normalized with respect to corresponding dense matrix multiplications (i.e., using a dense (2048 × 2048) matrix)."
Software Dependencies: Yes. "MKL library (operated by i7-7700 @ 3.6GHz) and CUDA 10.2 library (performed by NVIDIA V100) perform sparse matrix multiplications whose execution times are normalized with respect to corresponding dense matrix multiplications (i.e., using a dense (2048 × 2048) matrix)." (A sketch of this dense-normalized timing procedure appears below this list.)
Experiment Setup: Yes. "For our experiments, Nin is selected to be 8 such that we feed a decoder on a byte-level. [...] Specifically, for a given set of Nin and Nout, an element of M ∈ R^(Nout × ((Ns+1)·Nin)) is randomly assigned to 0 or 1 with equal probability. [...] For the ResNet-50 model (on ImageNet), we also consider signed INT8 format." (A sketch of this random decoder-matrix construction closes the section.)
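
For context on the pseudocode row above: Algorithm 1 is the textbook sparse matrix-vector product over a CSR-encoded matrix. Below is a minimal Python sketch of that standard kernel; the function and variable names are ours, not the paper's pseudocode.

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a matrix A stored in CSR format.

    values  : nonzero entries of A, listed row by row
    col_idx : column index of each entry in `values`
    row_ptr : entries of row i live in values[row_ptr[i]:row_ptr[i+1]]
    """
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows, dtype=x.dtype)
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[1, 0, 2],
#      [0, 3, 0]]
values  = np.array([1.0, 2.0, 3.0])
col_idx = np.array([0, 2, 1])
row_ptr = np.array([0, 2, 3])
x = np.array([1.0, 1.0, 1.0])
print(spmv_csr(values, col_idx, row_ptr, x))  # [3. 3.]
```

The per-element indirection through col_idx is what makes CSR-based SpMV memory-irregular; avoiding that overhead with a fixed-to-fixed encoded format is the motivation behind the paper's Algorithm 2.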
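The hardware and software rows describe timing sparse matrix multiplications and normalizing them against a dense (2048 × 2048) baseline. The sketch below illustrates only that normalization procedure; it uses SciPy on the CPU rather than the MKL and CUDA 10.2 harnesses the paper reports, and the sparsity level (90%) is an illustrative assumption.

```python
import time
import numpy as np
import scipy.sparse as sp

def time_op(fn, repeats=20):
    """Median wall-clock time of fn() over several runs, after a warm-up."""
    fn()
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return float(np.median(times))

n, sparsity = 2048, 0.9
dense = np.random.randn(n, n).astype(np.float32)
mask = np.random.rand(n, n) >= sparsity       # keep ~10% of entries
sparse = sp.csr_matrix(dense * mask)
x = np.random.randn(n).astype(np.float32)

t_dense = time_op(lambda: dense @ x)
t_sparse = time_op(lambda: sparse @ x)
print(f"normalized sparse time: {t_sparse / t_dense:.2f}x dense")
```

Normalized values above 1.0 indicate the sparse kernel is slower than its dense counterpart at that sparsity level.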
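Finally, the experiment-setup row states that each element of the decoder matrix M ∈ R^(Nout × ((Ns+1)·Nin)) is drawn as 0 or 1 with equal probability, with Nin = 8 for byte-level decoding and Ns = 2 used in Algorithm 3. A minimal sketch of that random construction follows; Nout = 16 is an arbitrary illustrative choice, not a value from the paper.

```python
import numpy as np

def random_decoder_matrix(n_out, n_in, n_s, seed=0):
    """M in {0,1}^(n_out x (n_s+1)*n_in): each entry is 0 or 1
    with equal probability, matching the paper's synthetic setup."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=(n_out, (n_s + 1) * n_in), dtype=np.uint8)

# Nin = 8 (byte-level decoding) and Ns = 2 come from the paper;
# Nout = 16 is a hypothetical value chosen for illustration.
M = random_decoder_matrix(n_out=16, n_in=8, n_s=2)
print(M.shape)  # (16, 24)
```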