Hardware-Aware Compression with Random Operation Access Specific Tile (ROAST) Hashing

Authors: Aditya Desai, Keren Zhou, Anshumali Shrivastava

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6. Experimental Evaluation Setup: In this section, we evaluate the ROAST compression approach on two types of tasks. The details of the tasks, datasets, and models used are mentioned in Table 2. For image-classification tasks, we choose the CIFAR-10 dataset and the leader of the DAWNBench benchmark (Coleman et al., 2017), a ResNet-9 model, for CIFAR-10.
Researcher Affiliation | Collaboration | Aditya Desai (1), Keren Zhou (1), Anshumali Shrivastava (1, 2). (1) Department of Computer Science, Rice University, Houston, Texas, United States; (2) ThirdAI Corp, Houston, Texas, United States.
Pseudocode | Yes | The pseudocode for ROAST-MM is shown in Algorithm 1. (An illustrative sketch of the tile-hashing idea appears below the table.)
Open Source Code | Yes | The ROAST-MM kernel implementation is open-source: https://github.com/apd10/RzLinear/tree/stable
Open Datasets | Yes | For image-classification tasks, we choose the CIFAR-10 dataset and the leader of the DAWNBench benchmark (Coleman et al., 2017), a ResNet-9 model, for CIFAR-10. ... We use the two largest available text-classification datasets for NLP tasks on Hugging Face (Hugging Face, 2022). (A loading sketch appears below the table.)
Dataset Splits | No | Table 2 provides the number of samples for training and testing for each dataset (e.g., 'amazon-polarity 3.6M/0.4M', which implies a train/test split), but it does not provide split information or sample counts for a separate validation set.
Hardware Specification | Yes | The measurements were taken using TF32 on an NVIDIA A100 GPU (48GB). (A TF32 setup sketch appears below the table.)
Software Dependencies | No | The paper mentions software such as Triton, cuBLAS, CUTLASS, and PyTorch, but does not specify their version numbers, which are necessary for reproducibility. (A version-logging sketch appears below the table.)
Experiment Setup | Yes | The other hyperparameters for NLP tasks are {batch size 64 for amazon-polarity and 32 for yelp-polarity, learning rate 2e-5, AdamW optimizer, linear scheduler}. Pruning is used as a baseline: iterative magnitude pruning interspersed with training, with two schedules. A full-9-1 schedule (alt. full-1-9) means we start with the fully trained model, perform iterative magnitude pruning to reach the required sparsity over 9 (alt. 1) epochs, and finally train 1 (alt. 9) epoch at the final sparsity. (A training and pruning sketch appears below the table.)
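
The paper's Algorithm 1 specifies the actual ROAST-MM procedure; as a rough, non-authoritative illustration of the idea behind it (weight tiles fetched from one shared compressed parameter array at hashed offsets), a PyTorch sketch follows. The tile size, the hash function, the divisibility assumption on the matrix dimensions, and the omission of any per-tile sign or scaling handling are simplifications of ours, not details taken from the paper.

```python
import torch

def roast_mm_sketch(x, params, out_features, tile=64, seed=0):
    """Matrix multiply whose weight tiles are read from a single shared
    compressed parameter array at hashed offsets (illustrative only; assumes
    both matrix dimensions are divisible by `tile`)."""
    in_features = x.shape[1]
    rows, cols = in_features // tile, out_features // tile
    n_offsets = params.numel() - tile * tile + 1  # valid tile start offsets
    row_blocks = []
    for r in range(rows):
        tiles = []
        for c in range(cols):
            # Cheap hash of the (row, col) tile index into the shared array.
            h = (r * 1_000_003 + c * 998_244_353 + seed) % n_offsets
            tiles.append(params[h:h + tile * tile].view(tile, tile))
        row_blocks.append(torch.cat(tiles, dim=1))
    W = torch.cat(row_blocks, dim=0)  # (in_features, out_features)
    return x @ W                      # gradients flow back into `params`

# Example: a 256 -> 128 "linear layer" backed by a 10k-entry shared array.
x = torch.randn(8, 256)
params = torch.randn(10_000, requires_grad=True)
y = roast_mm_sketch(x, params, out_features=128)  # shape (8, 128)
```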
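
For the Open Datasets row, all named datasets are publicly downloadable. A minimal loading sketch follows; the Hugging Face Hub identifiers ("amazon_polarity", "yelp_polarity") and the use of torchvision for CIFAR-10 are assumptions consistent with the paper's description. The validation carve-out at the end addresses the Dataset Splits row, since the paper reports only train/test counts.

```python
from datasets import load_dataset
from torchvision import datasets, transforms

# Text classification (Hub identifiers assumed to be the standard ones).
amazon = load_dataset("amazon_polarity")  # ~3.6M train / 0.4M test
yelp = load_dataset("yelp_polarity")      # ~560K train / 38K test

# Image classification: CIFAR-10 via torchvision.
cifar_train = datasets.CIFAR10(root="./data", train=True, download=True,
                               transform=transforms.ToTensor())
cifar_test = datasets.CIFAR10(root="./data", train=False, download=True,
                              transform=transforms.ToTensor())

# No validation split is specified in the paper; one option is to carve a
# small validation set out of the training split.
split = amazon["train"].train_test_split(test_size=0.05, seed=42)
train_ds, val_ds = split["train"], split["test"]
```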
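
The Hardware Specification row states that measurements use TF32 on an A100. When reproducing in PyTorch, TF32 execution is governed by the two backend flags below (their defaults vary across PyTorch versions); this is a generic setup sketch, not code from the paper.

```python
import torch

# Allow TF32 tensor-core execution on Ampere GPUs such as the A100.
torch.backends.cuda.matmul.allow_tf32 = True  # matmuls via cuBLAS
torch.backends.cudnn.allow_tf32 = True        # convolutions via cuDNN

if torch.cuda.is_available():
    a = torch.randn(4096, 4096, device="cuda")
    b = torch.randn(4096, 4096, device="cuda")
    c = a @ b  # runs with TF32 when the matmul flag above is enabled
```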
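
Because the Software Dependencies row notes that no version numbers are pinned, anyone reproducing the results should record their own environment. A small sketch, assuming the stack mentioned in the paper (PyTorch, Triton) plus the libraries typically used for the NLP experiments; cuBLAS and CUTLASS versions are determined by the CUDA toolkit reported by torch.version.cuda.

```python
import importlib
import platform

import torch

print("python:", platform.python_version())
print("torch:", torch.__version__)
print("cuda (torch build):", torch.version.cuda)
print("cudnn:", torch.backends.cudnn.version())

# Optional packages from the paper's stack; skip any that are not installed.
for name in ("triton", "transformers", "datasets"):
    try:
        module = importlib.import_module(name)
        print(f"{name}:", getattr(module, "__version__", "unknown"))
    except ImportError:
        print(f"{name}: not installed")
```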
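
For the Experiment Setup row, the sketch below wires up the quoted NLP hyperparameters (AdamW, learning rate 2e-5, a linear schedule) and one reading of the full-9-1 iterative-magnitude-pruning schedule. The stand-in model, the steps-per-epoch count, the 0.9 target sparsity, and the choice of torch.optim.lr_scheduler.LinearLR and torch.nn.utils.prune are our assumptions; the paper does not spell out these implementation details.

```python
import torch
import torch.nn as nn
from torch.nn.utils import prune

# Stand-in model; the paper's NLP runs fine-tune a transformer classifier.
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 2))

# Quoted hyperparameters (batch size is handled by the DataLoader).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
total_steps = 10 * 1_000  # assumption: 10 epochs x ~1,000 optimizer steps each
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.0, total_iters=total_steps)

def prune_to(model, target_sparsity, current_sparsity):
    """One magnitude-pruning round: prune enough of the still-unpruned weights
    (repeated pruning in torch.nn.utils.prune composes on the unpruned part)
    so that overall sparsity reaches `target_sparsity`."""
    increment = (target_sparsity - current_sparsity) / (1.0 - current_sparsity)
    layers = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]
    prune.global_unstructured(layers, pruning_method=prune.L1Unstructured,
                              amount=increment)

# full-9-1 schedule as we read it: start from a fully trained model, ramp the
# sparsity to its final value over 9 epochs, then train 1 epoch at that value.
final_sparsity = 0.9  # assumed target; the paper sweeps several budgets
current = 0.0
for epoch in range(10):
    target = final_sparsity * min(1.0, (epoch + 1) / 9)
    if target > current:
        prune_to(model, target, current)
        current = target
    # ... one epoch of training here, stepping `optimizer` and `scheduler` ...
```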