Unified Visual Transformer Compression

Authors: Shixing Yu, Tianlong Chen, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments are conducted with several ViT variants, e.g. DeiT and T2T-ViT backbones, on the ImageNet dataset, and our approach consistently outperforms recent competitors.
Researcher Affiliation | Collaboration | 1University of Texas at Austin, 2Texas A&M University, 3Kwai Inc.
Pseudocode | Yes | Algorithm 1: Gradient-based algorithm to solve problem (5) for Unified ViT Compression. Input: resource budget R_budget, learning rates η1–η6, number of total iterations τ. Result: pruned transformer weights W.
Open Source Code | Yes | Code is available online: https://github.com/VITA-Group/UVC.
Open Datasets | Yes | We conduct experiments for image classification on ImageNet (Krizhevsky et al., 2012).
Dataset Splits | No | The paper states "We conduct experiments for image classification on ImageNet (Krizhevsky et al., 2012)" and mentions "validation" in Section 3.1 (Preliminary), and the review's JSON schema itself has a "validation" field, but the paper does not provide specific details on the dataset splits (e.g., percentages or sample counts for training, validation, and testing).
Hardware Specification | No | The paper does not provide specific details regarding the hardware used for experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages or libraries used in the implementation.
Experiment Setup | Yes | Numerically, the learning rate for the parameter z is always changing during the primal-dual algorithm. Thus, we propose a dynamic learning rate for the parameter z, which controls the budget constraint. We use a four-step schedule of {1, 5, 9, 13, 17} in practice.
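Algorithm 1, as quoted in the table, is a gradient-based primal-dual scheme: descend on the transformer weights under a Lagrangian penalty, ascend on a dual variable that enforces the resource budget. The following is a minimal sketch under simplifying assumptions — a single dual variable, a differentiable scalar resource model, and hypothetical callables (`loss_grad`, `resource`, `resource_grad`) that are illustrative stand-ins, not the paper's actual interfaces.

```python
import numpy as np

def primal_dual_prune(w, loss_grad, resource, resource_grad,
                      r_budget, eta_w=1e-2, eta_y=1e-2, steps=1000):
    """Sketch of a primal-dual loop: minimize loss(w) subject to
    resource(w) <= r_budget via the Lagrangian
        L(w, y) = loss(w) + y * (resource(w) - r_budget).
    Gradient descent on the primal variables w, projected gradient
    ascent on the non-negative dual variable y."""
    y = 0.0  # Lagrange multiplier for the budget constraint
    for _ in range(steps):
        # Primal step: gradient of the Lagrangian w.r.t. w.
        w = w - eta_w * (loss_grad(w) + y * resource_grad(w))
        # Dual step: ascend on the constraint violation, clip at 0.
        y = max(0.0, y + eta_y * (resource(w) - r_budget))
    return w, y
```

On a toy quadratic (loss pulling w toward a target, resource = squared norm), the loop drives the resource usage down to the budget while y settles at the shadow price of the constraint.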
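The experiment-setup row mentions a dynamic learning rate for z driven by the schedule {1, 5, 9, 13, 17}. The paper excerpt does not say whether these values are multipliers or milestones; the sketch below assumes they are per-phase multipliers applied over equal fractions of training, which is one plausible reading, not the authors' confirmed rule.

```python
def dynamic_lr(base_lr, step, total_steps, scales=(1, 5, 9, 13, 17)):
    """Illustrative step schedule for the dual parameter z: split
    training into len(scales) equal phases and scale the base learning
    rate by the current phase's factor. The factor values mirror the
    {1, 5, 9, 13, 17} schedule quoted above; their exact role in the
    paper is an assumption here."""
    phase = min(int(step / total_steps * len(scales)), len(scales) - 1)
    return base_lr * scales[phase]
```

For example, with `base_lr=0.1` and 100 total steps, the learning rate starts at 0.1 and rises to 1.7 in the final phase, growing the constraint pressure as training proceeds.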