Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

PLUM: Improving Inference Efficiency By Leveraging Repetition-Sparsity Trade-Off

Authors: Sachit Kuhar, Yash Jain, Alexey Tumanov

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our results demonstrate that PLUM's quantization method is more accurate than binary quantization with the same number of non-zero weights. Detailed analysis indicates that signed binarization generates a smaller distribution of effectual (non-zero) parameters nested within a larger distribution of total parameters of latent full-precision weights for a DNN block. Finally, the proposed PLUM framework achieves a 26% speedup on real hardware, doubles energy efficiency, and reduces density by 2.8x compared to binary methods while retaining top-1 accuracy relative to prior-art methods for ResNets on ImageNet (achieving 66.2% top-1 accuracy), presenting an alternative solution for deploying efficient models in resource-limited environments.
Researcher Affiliation Academia Sachit Kuhar, Yash Jain, Alexey Tumanov, Georgia Institute of Technology
Pseudocode No The paper describes methods and mathematical formulations (e.g., Equation 1 and 2), but it does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code Yes Code available at https://github.com/sachitkuhar/PLUM
Open Datasets Yes We train ResNets on the CIFAR10 and ImageNet datasets... We train VGG, AlexNet, and ResNet on CIFAR10, SVHN (svh, 2011), and Tiny ImageNet (Le & Yang) datasets, respectively. Table 13: Dataset with Licenses: License and source of the datasets used. ImageNet Non-Commercial ILSVRC2012 CIFAR10 N/A CIFAR Licenses of ImageNet (Deng et al., 2009) and CIFAR10 (Krizhevsky & Hinton, 2009) datasets used in this paper are listed in Table 1.
Dataset Splits Yes We train ResNets on the CIFAR10 and ImageNet datasets... Every accuracy reported in this paper is on the validation set of the dataset. ImageNet and CIFAR10 are standard publicly used datasets.
Hardware Specification Yes The proposed PLUM framework achieves a 26% speedup on real hardware... We deploy quantized ResNet-18 models on Intel CPUs... We run all experiments on an Intel Xeon Gold 6226 CPU... 4-hour training on an NVIDIA T4 GPU.
Software Dependencies No The paper mentions specific tools and libraries such as "SumMerge (Prabhakar et al., 2021)", "STONNE (Muñoz-Martínez et al., 2021)", the "PyTorch frontend version of STONNE", and the "Adam Optimizer (Kingma & Ba, 2014)". However, it does not provide specific version numbers for these software components.
Experiment Setup Yes We train from scratch for 350 epochs and use the Adam Optimizer (Kingma & Ba, 2014). We start with an initial learning rate of 0.01 and reduce it by a factor of 10 at epochs 150, 200, and 320. For an apples-to-apples comparison with binary and ternary, we do a sweep over batch sizes {16, 32, 64, 128, 256} and activation functions (ReLU, PReLU, TanH) and report the best top-1 validation accuracy. For ablations on (1) value assignment percentage and (2) comparison with binary networks with comparable effectual operations, we select the batch size to be 32 and the activation function to be PReLU... We decrease the learning rate from 2.0e-4 to 2.0e-8 while training for 320 epochs, do not use weight decay, and use a batch size of 256 for training.
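The step schedule quoted in the Experiment Setup row (initial learning rate 0.01, divided by 10 at epochs 150, 200, and 320) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; in PyTorch the same schedule would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[150, 200, 320], gamma=0.1)`.

```python
# Dependency-free sketch of the paper's reported learning-rate schedule:
# start at 0.01 and multiply by 0.1 at each milestone epoch (150, 200, 320).
# Hypothetical helper for illustration only.

def lr_at_epoch(epoch, initial_lr=0.01, milestones=(150, 200, 320), gamma=0.1):
    """Return the learning rate in effect at a given (0-indexed) epoch."""
    decays = sum(1 for m in milestones if epoch >= m)
    return initial_lr * (gamma ** decays)

# Rates across the 350-epoch run described in the paper.
for epoch in (0, 149, 150, 200, 320, 349):
    print(epoch, lr_at_epoch(epoch))
```

A closed-form lookup like this is handy for resuming training mid-run, since the rate at any epoch is computable without replaying the scheduler.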