Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
EfficientFormer: Vision Transformers at MobileNet Speed
Authors: Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show the superiority of EfficientFormer in performance and speed on mobile devices. |
| Researcher Affiliation | Collaboration | ¹Snap Inc., ²Northeastern University |
| Pseudocode | No | The paper describes methods like 'Latency Driven Slimming' and a 'gradient-based search algorithm' in prose, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and models are available at https://github.com/snap-research/EfficientFormer. |
| Open Datasets | Yes | Our fastest model, EfficientFormer-L1, achieves 79.2% top-1 accuracy on ImageNet-1K [34] classification task |
| Dataset Splits | Yes | We experiment over COCO2017 [79] which contains training and validation sets of 118K and 5K images, respectively. |
| Hardware Specification | Yes | Our models are trained on a cluster with NVIDIA A100 and V100 GPUs. The inference speed on iPhone 12 (A14 bionic chip) is measured with iOS version 15 and averaged over 1,000 runs, with all available computing resources (NPU), or CPU only. |
| Software Dependencies | Yes | We implement EfficientFormer through PyTorch 1.11 [73] and Timm library [74] |
| Experiment Setup | Yes | We follow the training recipe from DeiT [3] but mainly report results with 300 training epochs... We use AdamW optimizer [75, 76], warm-up training with 5 epochs, and a cosine annealing learning rate schedule. The initial learning rate is set as $10^{-3}$ (batch size 1024) and the minimum learning rate is $10^{-5}$. |
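
The learning rate schedule quoted in the Experiment Setup row can be sketched as below. This is a minimal illustration, not the authors' training code. The source gives only the warm-up length (5 epochs), the total length (300 epochs), the initial learning rate ($10^{-3}$ at batch size 1024), and the minimum ($10^{-5}$). The function name `lr_at_epoch`, the linear warm-up shape, the warm-up starting value `warmup_start_lr`, and the per-epoch granularity are all assumptions made for this sketch.

```python
import math


def lr_at_epoch(epoch, total_epochs=300, warmup_epochs=5,
                base_lr=1e-3, min_lr=1e-5, warmup_start_lr=1e-6):
    """Return the learning rate for a given epoch (hypothetical helper).

    Values taken from the table: total_epochs, warmup_epochs, base_lr, min_lr.
    Assumptions: linear warm-up, warmup_start_lr, and per-epoch updates.
    """
    if epoch < warmup_epochs:
        # Assumed linear ramp from warmup_start_lr up to base_lr.
        frac = epoch / warmup_epochs
        return warmup_start_lr + frac * (base_lr - warmup_start_lr)

    # Cosine annealing from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the rate at a few epochs to show the overall shape.
    for e in (0, 5, 150, 300):
        print(e, lr_at_epoch(e))
```

Under these assumptions the rate rises during the first 5 epochs, peaks at the reported initial value, and then decays to the reported minimum by epoch 300. A real run would more likely use the scheduler utilities shipped with the training framework than a hand-written function.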