Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Authors: Shoufa Chen, Chongjian GE, Zhan Tong, Jiangliu Wang, Yibing Song, Jue Wang, Ping Luo
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five image and video datasets show that AdaptFormer largely improves ViTs in the target domains. |
| Researcher Affiliation | Collaboration | Shoufa Chen¹, Chongjian Ge¹, Zhan Tong², Jiangliu Wang², Yibing Song², Jue Wang², Ping Luo¹; ¹The University of Hong Kong, ²Tencent AI Lab |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/ShoufaChen/AdaptFormer. |
| Open Datasets | Yes | Image domain: CIFAR-100 [54]... Street View House Numbers (SVHN) [37]... The Food-101 [9] dataset... Video domain: Something-Something V2 (SSv2) [39]... HMDB51 [55]... NUS-WIDE [24] |
| Dataset Splits | Yes | CIFAR-100 [54] contains 50,000 training images and 10,000 validation images... Something-Something V2 (SSv2) [39]... It consists of 168,913 training samples, 24,777 validation samples and 27,157 testing samples... HMDB51 [55] is composed of 6,849 videos with 51 categories, making a split of 3.5k/1.5k train/val videos. |
| Hardware Specification | Yes | In this work, we use PyTorch toolkit [68] to conduct all experiments on NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper states 'we use PyTorch toolkit [68]', but it does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Unless otherwise stated, we use 8×8 GPUs for video experiments and 1×8 GPUs for image experiments. ... For the newly added modules, the weights of down-projection layers are initialized with Kaiming Normal [44], while the biases of the additional networks and the weights of the up-projection layers are configured with zero initialization. ... We trained all models for 40 epochs using Adam optimizer and 1-cycle learning rate policy [73]. The maximal learning rate is 0.001. |
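
The initialization quoted in the Experiment Setup row can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the function names, the dimensions, and the ReLU between the two projections are assumptions made here for illustration. It only shows why zero-initialized up-projection weights and zero biases make a newly added module output zeros at the start of training, whatever values the Kaiming-Normal down-projection takes.

```python
import math
import random


def kaiming_normal(rows, cols, rng):
    """Kaiming (He) normal init: samples from N(0, 2 / fan_in), fan_in = cols."""
    std = math.sqrt(2.0 / cols)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


def matvec(weight, bias, x):
    """Compute weight @ x + bias for a list-of-lists weight matrix."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def init_adapter(dim, bottleneck, seed=0):
    """Hypothetical helper: build the parameters of one added bottleneck module."""
    rng = random.Random(seed)
    return {
        "down_w": kaiming_normal(bottleneck, dim, rng),      # Kaiming Normal weights
        "down_b": [0.0] * bottleneck,                        # zero bias
        "up_w": [[0.0] * bottleneck for _ in range(dim)],    # zero weights
        "up_b": [0.0] * dim,                                 # zero bias
    }


def adapter_forward(params, x):
    """Down-project, apply ReLU (assumed nonlinearity), then up-project."""
    hidden = matvec(params["down_w"], params["down_b"], x)
    hidden = [max(0.0, h) for h in hidden]
    return matvec(params["up_w"], params["up_b"], hidden)


params = init_adapter(dim=8, bottleneck=2)
output = adapter_forward(params, [0.5 * i for i in range(8)])
print(output)  # every entry is 0.0 at initialization
```

Under these assumptions, adding the module's output to a frozen backbone's features leaves those features unchanged before the first update, which is the usual motivation for this kind of zero initialization.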