Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Focal Modulation Networks
Authors: Jianwei Yang, Chunyuan Li, Xiyang Dai, Jianfeng Gao
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show FocalNets outperform the state-of-the-art SA counterparts (e.g., Swin and Focal Transformers) with similar computational cost on the tasks of image classification, object detection, and semantic segmentation. |
| Researcher Affiliation | Industry | Jianwei Yang, Chunyuan Li, Xiyang Dai, Jianfeng Gao EMAIL |
| Pseudocode | Yes | Algorithm 1: Pseudo code for Focal Modulation. |
| Open Source Code | Yes | Code is available at: https://github.com/microsoft/FocalNet. |
| Open Datasets | Yes | We compare different methods on ImageNet-1K classification [16]. Overall, we train FocalNet-T, FocalNet-S and FocalNet-B with ImageNet-1K training set... When pretrained on ImageNet-22K... We make comparisons on object detection with COCO 2017 [42]. We use ADE20K [95] for our experiments |
| Dataset Splits | Yes | report Top-1 accuracy (%) on the validation set. ...evaluated on 5K validation images. ...ADE20K validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Pytorch-style pseudo code' and 'AdamW' as optimizer but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For training, we use AdamW [48] as the optimizer with initial learning rate $10^{-4}$ and weight decay 0.05. All models are trained with batch size 16. We set the stochastic drop rates to 0.1, 0.2, 0.3 in 1× and 0.3, 0.5, 0.5 in 3× training schedule for FocalNet-T/S/B, respectively. |
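
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, which may help when checking a rerun against the reported values. The following is only a minimal sketch: the class name `TrainingSetup`, its field names, and the schedule keys `1x` and `3x` (used here for the two training schedules) are assumptions made for illustration and do not come from the paper or its repository. The numeric values are the ones quoted in the table above.

```python
from dataclasses import dataclass

# Stochastic drop rates quoted in the Experiment Setup row, keyed by
# training schedule and then by model variant. The key spellings are
# illustrative assumptions.
DROP_RATES = {
    "1x": {"FocalNet-T": 0.1, "FocalNet-S": 0.2, "FocalNet-B": 0.3},
    "3x": {"FocalNet-T": 0.3, "FocalNet-S": 0.5, "FocalNet-B": 0.5},
}


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical container; names are not taken from the paper's code."""

    model: str
    schedule: str
    optimizer: str = "AdamW"
    learning_rate: float = 1e-4  # initial learning rate
    weight_decay: float = 0.05
    batch_size: int = 16

    @property
    def drop_rate(self) -> float:
        # Look up the stochastic drop rate for this model and schedule.
        return DROP_RATES[self.schedule][self.model]


setup = TrainingSetup(model="FocalNet-S", schedule="3x")
print(setup.optimizer, setup.learning_rate, setup.weight_decay, setup.drop_rate)
```

Keeping the drop rates in a lookup table rather than as per-model constants makes the dependence on both model size and schedule explicit, which is the one part of the quoted setup that varies.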