ESPACE: Dimensionality Reduction of Activations for Model Compression
Authors: Charbel Sakr, Brucek Khailany
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we report on experimental studies investigating LLM compression using ESPACE. Accuracy is evaluated in two ways: perplexity measured on the Wikitext-103 dataset [36] and zero-shot downstream task accuracy on BoolQ (BQ) [37], HellaSwag (HS) [38], PIQA (PQ) [39], RACE (RA) [40], and WinoGrande (WG) [41]. |
| Researcher Affiliation | Industry | Charbel Sakr NVIDIA Research csakr@nvidia.com Brucek Khailany NVIDIA Research bkhailany@nvidia.com |
| Pseudocode | No | The paper does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | As such, we believe the description of the work in the paper is sufficient for reproducibility; yet, we are happy to consider open sourcing our code in the future. |
| Open Datasets | Yes | Accuracy is evaluated in two ways: perplexity measured on the Wikitext-103 dataset [36] and zero-shot downstream task accuracy on BoolQ (BQ) [37], HellaSwag (HS) [38], PIQA (PQ) [39], RACE (RA) [40], and WinoGrande (WG) [41]. ... Retraining simply extends the models' pre-training sessions and uses the 330B-token MTNLG dataset [43], which was used to train GPT3 models. |
| Dataset Splits | Yes | The Wikitext-103 dataset is split into train, validation, and test sets. We use 512 random sequences from the training set for calibrating projection matrices required by ESPACE. We use the validation set for layer-wise sensitivity studies. |
| Hardware Specification | Yes | We measure using an NVIDIA A100 GPU and a simple, un-optimized implementation (see Appendix B.4). |
| Software Dependencies | No | Our implementation is built on top of Megatron-LM [33], which itself is based on the PyTorch framework. ... We then use the CuPy library in RAPIDS to perform fast (a few milliseconds per auto-correlation matrix) eigenvalue decomposition on GPUs. (Specific version numbers for these software components are not provided; a hedged sketch of this eigendecomposition step appears after the table.) |
| Experiment Setup | Yes | For GPT3-1.3B, the initial learning rate is set to 1.0 × 10⁻⁴, the final learning rate is set to 1.0 × 10⁻⁵, and the global batch size is set to 512. (Similar details are provided for other models in Appendix B.3.) |
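The calibration step described above (512 Wikitext-103 training sequences, per-layer activation auto-correlation matrices, GPU eigenvalue decomposition via CuPy) can be sketched roughly as follows. This is a minimal illustration assembled only from the details quoted in the table, not the authors' code: the `calibrate_projection` helper, the toy tensor shapes, and the choice of retained rank `k` are illustrative assumptions.

```python
import numpy as np
import cupy as cp  # GPU arrays and linear algebra, as referenced in the paper

def calibrate_projection(activations: np.ndarray, k: int) -> np.ndarray:
    """Hypothetical helper: derive a rank-k activation projection for one layer.

    activations: (num_tokens, hidden_dim) matrix gathered from the
    calibration sequences. Returns a (hidden_dim, k) matrix whose columns
    are the top eigenvectors of the activation auto-correlation matrix.
    """
    x = cp.asarray(activations, dtype=cp.float64)
    # Auto-correlation matrix of the activations: (hidden_dim, hidden_dim).
    corr = (x.T @ x) / x.shape[0]
    # Eigenvalue decomposition on the GPU; eigh applies since corr is symmetric.
    eigvals, eigvecs = cp.linalg.eigh(corr)
    # eigh returns eigenvalues in ascending order; keep the k largest components.
    top_k = eigvecs[:, -k:][:, ::-1]
    return cp.asnumpy(top_k)

# Toy usage with made-up sizes (a real run would hook per-layer activations
# from the 512 Wikitext-103 calibration sequences inside Megatron-LM).
acts = np.random.randn(4096, 1024).astype(np.float32)
P = calibrate_projection(acts, k=256)
print(P.shape)  # (1024, 256)
```

Using `eigh` rather than a general eigensolver is the natural choice here, since an auto-correlation matrix is symmetric positive semi-definite; that structure is also what makes each per-matrix decomposition fast, consistent with the "few milliseconds per auto-correlation matrix" quoted above.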