Matrix Compression via Randomized Low Rank and Low Precision Factorization

Authors: Rajarshi Saha, Varun Srivastava, Mert Pilanci

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically demonstrate the efficacy of our algorithm in image compression, nearest neighbor classification of image and text embeddings, and compressing the layers of LlaMa-7b. Our results illustrate that we can achieve compression ratios as aggressive as one bit per matrix coordinate, all while surpassing or maintaining the performance of traditional compression techniques.
Researcher Affiliation | Academia | Rajarshi Saha, Varun Srivastava, Mert Pilanci, Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA. {rajsaha,vsriva,pilanci}@stanford.edu
Pseudocode | Yes | Algorithm 1: LPLR: Randomized Low-Precision Low-Rank factorization. ... Algorithm 2: Direct-SVD quant.: Directly quantizing the optimal low-rank factorization. (A hedged code sketch of the LPLR structure is given after this table.)
Open Source Code | Yes | Our code is available at https://github.com/pilancilab/matrix-compressor.
Open Datasets | Yes | For CIFAR-10 and CIFAR-100, we embed the entire dataset using MobileNetV3 (Howard et al. [24]) pretrained on ImageNet (Deng et al. [12])... The IMDB (mte [2]) dataset consists of 25,000 train and test sentences... The Emotion (mte [1]) dataset is a sentiment analysis dataset, containing 16,000 train and 2,000 test sentences... (An illustrative embedding sketch follows the table.)
Dataset Splits | No | The paper specifies training and test splits; for example, for CIFAR-10 it states: "The dataset is split into 50,000 training images and 10,000 test images". However, it does not explicitly mention a separate validation split or its size.
Hardware Specification | Yes | All experiments were performed on a single NVIDIA TITAN RTX GPU.
Software Dependencies | No | The main algorithm is implemented in PyTorch (Paszke et al. [46]) and uses Hugging Face [80] implementations of all datasets and large language models. The paper names the software but does not specify version numbers for PyTorch or the Hugging Face libraries.
Experiment Setup | Yes | We utilize a uniform bit budget B1 = B2 = 8 bits for the quantizers Q1, Q2 across all cases. (The bit accounting behind the "one bit per matrix coordinate" figure is sketched after the table.)
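
The paper's pseudocode is not reproduced on this page; the snippet below is a minimal PyTorch sketch of the LPLR structure referenced in the Pseudocode row (quantize a random Gaussian sketch of the matrix, then quantize a least-squares right factor). The uniform min-max quantizer, its scaling, and the sketch dimension m are illustrative assumptions, not the paper's exact choices.

```python
import torch

def uniform_quantize(X, bits=8):
    # Illustrative min-max uniform quantizer standing in for Q1/Q2;
    # the paper's quantizers may handle the dynamic range differently.
    lo, hi = X.min(), X.max()
    scale = (hi - lo) / (2 ** bits - 1)
    return torch.round((X - lo) / scale) * scale + lo

def lplr(A, m, bits=8):
    # Randomized low-precision low-rank factorization (LPLR-style structure):
    # 1) sketch the column space with a random Gaussian matrix S,
    # 2) quantize the sketch to obtain the left factor Z,
    # 3) solve a least-squares problem for the right factor and quantize it too.
    n, d = A.shape
    S = torch.randn(d, m) / m ** 0.5
    Z = uniform_quantize(A @ S, bits)      # left factor, low precision
    W = torch.linalg.pinv(Z) @ A           # least-squares right factor
    return Z, uniform_quantize(W, bits)    # A is approximated by Z @ W

A = torch.randn(512, 1024)
Z, W = lplr(A, m=64)
print(torch.linalg.matrix_norm(A - Z @ W) / torch.linalg.matrix_norm(A))
```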
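For the Open Datasets row, the following is a hedged torchvision sketch of embedding CIFAR-10 with an ImageNet-pretrained MobileNetV3 as a fixed feature extractor. The specific model variant, weights, and preprocessing are assumptions and may differ from the paper's setup (the weights API requires torchvision >= 0.13).

```python
import torch
import torchvision
from torchvision.models import mobilenet_v3_small, MobileNet_V3_Small_Weights

# Assumed variant and weights; the paper may use a different MobileNetV3 configuration.
weights = MobileNet_V3_Small_Weights.IMAGENET1K_V1
model = mobilenet_v3_small(weights=weights)
model.classifier = torch.nn.Identity()   # keep pooled features, drop the ImageNet head
model.eval()

preprocess = weights.transforms()        # ImageNet resizing and normalization
testset = torchvision.datasets.CIFAR10(root="data", train=False,
                                       download=True, transform=preprocess)
loader = torch.utils.data.DataLoader(testset, batch_size=256)

with torch.no_grad():
    embeddings = torch.cat([model(x) for x, _ in loader])   # shape (10000, 576)
```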
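To connect the 8-bit budget in the Experiment Setup row with the "one bit per matrix coordinate" figure quoted in the Research Type row, a back-of-the-envelope count of the storage for the two quantized factors is shown below; the dimensions are illustrative, not taken from the paper.

```python
# Storage for a rank-m factorization Z (n x m) and W (m x d) at B1 = B2 = 8 bits,
# compared against the n x d entries of the original matrix.
n, d, m = 4096, 4096, 256        # illustrative dimensions
B1 = B2 = 8
bits_per_coordinate = (n * m * B1 + m * d * B2) / (n * d)
print(bits_per_coordinate)       # 1.0 bit per matrix coordinate
print(32 / bits_per_coordinate)  # 32x compression relative to fp32 storage
```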