Focal Modulation Networks

Authors: Jianwei Yang, Chunyuan Li, Xiyang Dai, Jianfeng Gao

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show FocalNets outperform the state-of-the-art SA counterparts (e.g., Swin and Focal Transformers) with similar computational costs on the tasks of image classification, object detection, and semantic segmentation."
Researcher Affiliation | Industry | "Jianwei Yang, Chunyuan Li, Xiyang Dai, Jianfeng Gao {jianwyan,chunyl,xidai,jfgao}@microsoft.com"
Pseudocode | Yes | "Algorithm 1: Pseudo code for Focal Modulation."
Open Source Code | Yes | "Code is available at: https://github.com/microsoft/FocalNet."
Open Datasets | Yes | "We compare different methods on ImageNet-1K classification [16]. Overall, we train FocalNet-T, FocalNet-S and FocalNet-B with ImageNet-1K training set... When pretrained on ImageNet-22K... We make comparisons on object detection with COCO 2017 [42]. We use ADE20K [95] for our experiments."
Dataset Splits | Yes | "...report Top-1 accuracy (%) on the validation set. ...evaluated on 5K validation images. ...ADE20K validation set."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or other machine specifications) used to run its experiments.
Software Dependencies | No | The paper mentions "PyTorch-style pseudo code" and AdamW as the optimizer, but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | "For training, we use AdamW [48] as the optimizer with initial learning rate 1e-4 and weight decay 0.05. All models are trained with batch size 16. We set the stochastic drop rates to 0.1, 0.2, 0.3 in the 1× and 0.3, 0.5, 0.5 in the 3× training schedule for FocalNet-T/S/B, respectively."
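
The focal modulation operator audited above (Algorithm 1 in the paper) can be illustrated with a minimal NumPy sketch: a query projection, a stack of progressively larger context aggregations with per-token gates, and an element-wise modulation. The uniform-kernel depthwise convolution, random projection weights, and ReLU nonlinearity here are simplifications for illustration, not the paper's implementation.

```python
import numpy as np

def depthwise_conv1d(x, k):
    # x: (L, C). Same-padded per-channel smoothing with a uniform kernel
    # (stands in for the paper's learned depthwise convolution).
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        out[i] = xp[i:i + k].mean(axis=0)
    return out

def focal_modulation(x, num_levels=3, seed=0):
    # x: (L, C) token features. Random weights are placeholders for
    # the learned linear layers q, gates, and h in the paper.
    rng = np.random.default_rng(seed)
    L, C = x.shape
    Wq = rng.standard_normal((C, C)) / np.sqrt(C)
    Wg = rng.standard_normal((C, num_levels + 1)) / np.sqrt(C)
    Wh = rng.standard_normal((C, C)) / np.sqrt(C)
    q = x @ Wq                       # query projection
    gates = x @ Wg                   # per-token, per-level gating weights
    ctx, agg = x.copy(), np.zeros_like(x)
    for l in range(num_levels):
        # Hierarchical contexts: growing receptive field per level.
        ctx = np.maximum(depthwise_conv1d(ctx, k=3 + 2 * l), 0.0)
        agg += gates[:, l:l + 1] * ctx
    # Global context from average pooling, weighted by the last gate.
    agg += gates[:, num_levels:] * ctx.mean(axis=0, keepdims=True)
    modulator = agg @ Wh             # modulator projection h(.)
    return q * modulator             # element-wise modulation

x = np.random.default_rng(1).standard_normal((16, 8))
y = focal_modulation(x)
print(y.shape)
```

Note that, unlike self-attention, nothing here forms a token-to-token interaction matrix; all context is aggregated by convolution and pooling before modulating the query, which is the source of the favorable compute trade-off the audit quotes.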
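
The optimizer settings quoted in the Experiment Setup row (AdamW, learning rate 1e-4, weight decay 0.05) can be made concrete with a single decoupled-weight-decay update step. `adamw_step` is a hypothetical helper; the beta and epsilon values are common PyTorch defaults, not values reported by the paper.

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-4, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=0.05):
    # One AdamW update with decoupled weight decay: the decay term is
    # applied directly to the parameters, not folded into the gradient.
    m = betas[0] * m + (1 - betas[0]) * grad          # first moment
    v = betas[1] * v + (1 - betas[1]) * grad ** 2     # second moment
    m_hat = m / (1 - betas[0] ** t)                   # bias correction
    v_hat = v / (1 - betas[1] ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps) \
            - lr * weight_decay * theta               # decoupled decay
    return theta, m, v

theta = np.ones(3)
m = np.zeros(3)
v = np.zeros(3)
theta, m, v = adamw_step(theta, grad=0.1 * np.ones(3), m=m, v=v, t=1)
print(theta)
```

With these settings the per-step movement is tiny (on the order of lr), which matches the long detection schedules (1×/3×) the quote refers to.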