AiluRus: A Scalable ViT Framework for Dense Prediction
Authors: Jin Li, Yaoming Wang, Xiaopeng Zhang, Bowen Shi, Dongsheng Jiang, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed method on three different datasets and observe promising performance. For example, the "Segmenter ViT-L" model can be accelerated by 48% FPS without fine-tuning, while maintaining the performance. Additionally, our method can be applied to accelerate fine-tuning as well. Experimental results demonstrate that we can save 52% training time while accelerating 2.46 FPS with only a 0.09% performance drop. |
| Researcher Affiliation | Collaboration | Jin Li¹, Yaoming Wang¹, Xiaopeng Zhang², Bowen Shi¹, Dongsheng Jiang², Chenglin Li¹, Wenrui Dai¹, Hongkai Xiong¹, Qi Tian²; ¹Shanghai Jiao Tong University, ²Huawei Cloud |
| Pseudocode | No | The paper describes its method using natural language and mathematical equations, but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code is available at https://github.com/caddyless/ailurus/tree/main. |
| Open Datasets | Yes | We evaluate our proposed method on three different datasets and observe promising performance. For example, the "Segmenter ViT-L" model can be accelerated by 48% FPS without fine-tuning, while maintaining the performance. Additionally, our method can be applied to accelerate fine-tuning as well. Experimental results demonstrate that we can save 52% training time while accelerating 2.46 FPS with only a 0.09% performance drop. |
| Dataset Splits | Yes | The produced assignments are collected across the ADE20K [37] validation set. |
| Hardware Specification | Yes | We fine-tune the pre-trained models on 8 V100-32G and evaluate the FPS on a single V100-32G. |
| Software Dependencies | No | The paper mentions using "MMsegmentation [5]" as its code base, but it does not provide specific version numbers for this or any other software dependencies. |
| Experiment Setup | Yes | We conducted hyper-parameter ablation experiments on the adaptive resolution strategy presented in Section 3.2 using the ADE20K semantic segmentation benchmark and the officially released Segmenter ViT-L/16 [26] checkpoint. For the neighbor weight hyper-parameter α, we searched its value from 0.6 to 1.0 (1.0 indicates disabling this hyper-parameter), and the results showed that α = 0.9 performed best. Similarly, we searched the value of λ from 0 to 70 (0 indicates not using spatial information), and the results showed that λ = 50 performed best. The ablation results of k indicated that k = 1, i.e., choosing the closest token to calculate the local density, performed best. |
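
The Experiment Setup row reports that k = 1, i.e., using only the closest token to compute the local density, performed best. As a rough illustration of what a k-nearest-neighbor local density looks like, the sketch below uses a common density-peaks style formulation, rho_i = exp(-mean squared distance to the k nearest tokens). This formula, the function name `knn_local_density`, and the toy inputs are assumptions made for illustration; the paper's exact definition (including how α and λ enter) is not given in this summary and is not reproduced here.

```python
import numpy as np


def knn_local_density(tokens: np.ndarray, k: int = 1) -> np.ndarray:
    """Hypothetical helper: k-NN local density for each token.

    Assumed formulation (not taken from the paper):
    rho_i = exp(-mean squared distance from token i to its k nearest other tokens).
    `tokens` has shape (N, D) and k must satisfy 1 <= k <= N - 1.
    """
    # Pairwise squared Euclidean distances, shape (N, N).
    diff = tokens[:, None, :] - tokens[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    # Exclude each token's zero distance to itself.
    np.fill_diagonal(dist2, np.inf)
    # The first k columns after partitioning hold the k smallest distances per row.
    nearest = np.partition(dist2, k - 1, axis=1)[:, :k]
    return np.exp(-nearest.mean(axis=1))


# Toy example: two tokens close together and one far away.
toy_tokens = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
rho = knn_local_density(toy_tokens, k=1)
```

With k = 1, the two nearby tokens receive the same, higher density, while the isolated token receives a much lower one, which is the behavior a density-based token clustering step would rely on.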