MG-ViT: A Multi-Granularity Method for Compact and Efficient Vision Transformers
Authors: Yu Zhang, Yepeng Liu, Duoqian Miao, Qi Zhang, Yiwei Shi, Liang Hu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments prove the effectiveness of the multi-granularity strategy. For instance, on ImageNet, without any loss of performance, MG-ViT reduces the FLOPs of LV-ViT-S by 47% and of DeiT-S by 56%. |
| Researcher Affiliation | Academia | Yu Zhang (Tongji University), Yepeng Liu (University of Florida), Duoqian Miao (Tongji University), Qi Zhang (Tongji University), Yiwei Shi (University of Bristol), Liang Hu (Tongji University) |
| Pseudocode | No | The paper contains architectural diagrams and mathematical formulas but does not include any explicit pseudocode blocks or sections labeled 'Algorithm'. |
| Open Source Code | No | The paper does not include any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We select LV-ViT [17] and DeiT [15] to assess the performance of MG-ViT on ImageNet [18]. ...object detection and semantic segmentation on the MS-COCO [19] and ADE20K [20] datasets, respectively. |
| Dataset Splits | No | The paper mentions using standard datasets like ImageNet, MS-COCO, and ADE20K, which have predefined splits. However, it does not explicitly state the training/validation/test dataset splits by percentage, absolute sample counts, or specific split files. |
| Hardware Specification | Yes | All metrics are measured on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using the 'AdamW optimizer' but does not specify version numbers for other key software components or libraries such as Python, PyTorch/TensorFlow, or CUDA. |
| Experiment Setup | Yes | The resolution of input images in our experiments is 224 × 224. In SGIS, we split each image into 7 × 7 patches. r_h and r_t are set to 0.1 and 0.4, respectively. For DeiT-S, we inserted a total of three PPSM modules in the 3rd, 7th, and 10th layers. For conducting the training process, we set the batch size to 256 and use the AdamW optimizer to train all models for 300 epochs. |
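
The Experiment Setup row pins down most of the reported training recipe. The sketch below collects those values into a single configuration, assuming a PyTorch/timm stack. Since no official code is released, the MG-ViT-specific pieces (SGIS splitting, PPSM modules) appear only as commented placeholders, and the learning rate and weight decay are assumed DeiT-style defaults, not values stated in the paper.

```python
import timm
from torch.optim import AdamW

# Hyper-parameters reported in the "Experiment Setup" row.
config = dict(
    image_size=224,          # inputs are 224 x 224
    sgis_grid=7,             # SGIS splits each image into 7 x 7 patches
    r_h=0.1,                 # ratio r_h reported in the paper
    r_t=0.4,                 # ratio r_t reported in the paper
    ppsm_layers=(3, 7, 10),  # PPSM modules inserted into DeiT-S
    batch_size=256,
    epochs=300,
)

# Plain DeiT-S from timm as a stand-in backbone: the SGIS/PPSM
# components are not publicly released, so they are omitted here.
model = timm.create_model("deit_small_patch16_224", pretrained=False)

# AdamW per the paper; lr and weight_decay are assumptions, not
# values the paper specifies.
optimizer = AdamW(model.parameters(), lr=5e-4, weight_decay=0.05)
```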
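Similarly, the Hardware Specification row states that all efficiency metrics were measured on a single NVIDIA RTX 3090. A minimal sketch of that kind of measurement, again using a stock DeiT-S stand-in and the fvcore FLOP counter (the paper does not name its profiling tool), might look like this:

```python
import time

import timm
import torch
from fvcore.nn import FlopCountAnalysis

# Stand-in backbone; assumes a CUDA GPU such as the RTX 3090 used in the paper.
model = timm.create_model("deit_small_patch16_224", pretrained=False).eval().cuda()
x = torch.randn(1, 3, 224, 224, device="cuda")

# FLOPs at the 224 x 224 input resolution. fvcore counts one fused
# multiply-add as a single FLOP, matching the common ViT convention.
flops = FlopCountAnalysis(model, x)
print(f"GFLOPs: {flops.total() / 1e9:.2f}")

# Latency with warm-up and explicit CUDA synchronization.
with torch.no_grad():
    for _ in range(10):  # warm-up iterations
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()
print(f"ms/image: {(time.time() - start) / 100 * 1000:.2f}")
```

Comparing such counts for a baseline backbone and its token-pruned counterpart is how FLOP reductions like the reported 47% (LV-ViT-S) and 56% (DeiT-S) would be verified.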