Sharpness-Aware Data Generation for Zero-shot Quantization
Authors: Hoang Anh Dung, Cuong Pham, Trung Le, Jianfei Cai, Thanh-Toan Do
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations on CIFAR-100 and ImageNet datasets demonstrate the superiority of the proposed method over the state-of-the-art techniques in low-bit quantization settings. |
| Researcher Affiliation | Academia | Department of Data Science and AI, Monash University, Melbourne, Australia. |
| Pseudocode | Yes | Algorithm 1 SA zero-shot quantization. |
| Open Source Code | No | The paper states that Genie's officially released code was used to produce some results, but it does not state that the code for SADAG (the proposed method) is released, nor does it provide a link. |
| Open Datasets | Yes | We evaluate our approach on CIFAR-100 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) datasets, which are commonly utilized for zero-shot quantization. |
| Dataset Splits | No | The paper states the use of CIFAR-100 and ImageNet datasets but does not explicitly provide specific train/validation/test split percentages or sample counts for its experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or cloud computing instances) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and specific learning rate schedulers, but does not provide version numbers for any software components or libraries used. |
| Experiment Setup | Yes | The learning rates of the generator and embedding are initially set at 0.1 and 0.01, respectively. We adopt the Adam optimizer (Kingma & Ba, 2014) for both generator and data embedding, but utilize different schedulers for them, i.e., the ExponentialLR scheduler and the ReduceLROnPlateau scheduler are used for scheduling the learning rates of the generator and the embeddings, respectively. Across all experiments, the batch size for the data generation process is set to 128, while in the quantization step, we keep the batch size at 32. The threshold ζ in Eq. (19) is set to 0 or 0.1. The radius ν in Eq. (16) for the embedding perturbation is set to 2. |
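The reported setup maps naturally onto PyTorch. The sketch below is a minimal illustration, not the authors' implementation (no code is released): the Adam optimizer, `ExponentialLR`, and `ReduceLROnPlateau` schedulers, learning rates, batch size, and radius ν = 2 come from the paper, while the generator architecture, the `gamma` decay factor, the `generation_loss` stand-in, and the exact SAM-style perturbation rule are our assumptions based on the paper's sharpness-aware description.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: the paper releases no code, so the generator and the
# per-sample embeddings below are placeholders for illustration only.
generator = nn.Sequential(nn.Linear(256, 3 * 32 * 32), nn.Tanh())
embeddings = nn.Parameter(torch.randn(128, 256))  # batch size 128 for generation

# Optimizers/schedulers as reported: Adam for both, lr 0.1 (generator) and
# 0.01 (embeddings); ExponentialLR for the generator, ReduceLROnPlateau for
# the embeddings. gamma=0.95 is an assumed decay factor.
opt_gen = torch.optim.Adam(generator.parameters(), lr=0.1)
opt_emb = torch.optim.Adam([embeddings], lr=0.01)
sched_gen = torch.optim.lr_scheduler.ExponentialLR(opt_gen, gamma=0.95)
sched_emb = torch.optim.lr_scheduler.ReduceLROnPlateau(opt_emb, mode="min")

NU = 2.0  # radius of the embedding perturbation (Eq. 16)

def generation_loss(images: torch.Tensor) -> torch.Tensor:
    """Stand-in for the paper's data-generation objective (e.g., matching
    the full-precision model's statistics); not reproduced here."""
    return images.pow(2).mean()

for step in range(100):
    # First pass: the gradient w.r.t. the embeddings gives the ascent direction.
    loss = generation_loss(generator(embeddings))
    grad = torch.autograd.grad(loss, embeddings)[0]

    # SAM-style perturbation: move the embeddings to the (approximate)
    # worst point inside an L2 ball of radius NU, then descend from there.
    eps = NU * grad / (grad.norm() + 1e-12)

    # Second pass: minimize the loss evaluated at the perturbed embeddings.
    opt_gen.zero_grad()
    opt_emb.zero_grad()
    generation_loss(generator(embeddings + eps)).backward()
    opt_gen.step()
    opt_emb.step()

    sched_gen.step()
    sched_emb.step(loss.item())  # ReduceLROnPlateau tracks the raw loss
```

The threshold ζ from Eq. (19) and the quantization step (batch size 32) are omitted; the loop only illustrates how the two optimizers, their schedulers, and the radius-ν embedding perturbation fit together.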