Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Memory-Efficient Transformer Adapter for Dense Predictions
Authors: Dong Zhang, Rui Yan, Pingcheng Dong, Kwang-Ting Cheng
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, extensive evaluations on multiple representative datasets validate that META substantially enhances the predicted quality, while achieving a new state-of-the-art accuracy-efficiency trade-off. Theoretically, we demonstrate that META exhibits superior generalization capability and stronger adaptability. |
| Researcher Affiliation | Academia | Dong Zhang¹·², Rui Yan³, Pingcheng Dong¹, Kwang-Ting Cheng¹; ¹The Hong Kong University of Science and Technology, ²AI Chip Center for Emerging Smart Systems (ACCESS), ³Nanjing University. EMAIL;EMAIL;EMAIL |
| Pseudocode | No | The paper describes the architecture and computational processes using mathematical formulas and descriptive text, but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | To facilitate a fair result comparison with existing methods, we conduct experiments, including the ablation analysis, on two commonly used datasets: MS-COCO (Caesar et al., 2018) for ODet and ISeg, and ADE20K (Zhou et al., 2017) for SSeg. |
| Dataset Splits | Yes | We report the experimental results on the val set of MS-COCO (Caesar et al., 2018), where the ImageNet-1k pre-trained ViT-B (Li et al., 2022b) is used as the backbone. For SSeg, we choose UperNet (Xiao et al., 2018) with 160k iterations as the baseline, where the ImageNet-1k pre-trained ViT-B (Li et al., 2022b) is used as the backbone. We report the single-scale testing results on the val set of ADE20K (Zhou et al., 2017). |
| Hardware Specification | Yes | The reported inference results are measured by A100 GPUs with per-GPU batch size 2. |
| Software Dependencies | No | The paper mentions various models and baselines (e.g., Mask R-CNN, Cascade Mask R-CNN, ViT-Adapter) but does not specify software versions for programming languages, libraries, or frameworks used for implementation (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Unless otherwise specified, these baselines are set up to be consistent with their papers and the settings of the ViT-Adapter (Chen et al., 2022b) method. Even with different training schedules (i.e., 1×, and 3× with MS), our method can also improve the model performance, demonstrating the plug-and-play advantage of META. For SSeg, we choose UperNet (Xiao et al., 2018) with 160k iterations as the baseline, where the ImageNet-1k pre-trained ViT-B (Li et al., 2022b) is used as the backbone. |