Efficient Modulation for Vision Networks

Authors: Xu Ma, Xiyang Dai, Jianwei Yang, Bin Xiao, Yinpeng Chen, Yun Fu, Lu Yuan

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we validate our EfficientMod on four tasks: image classification on ImageNet-1K (Deng et al., 2009), object detection and instance segmentation on MS COCO (Lin et al., 2014), and semantic segmentation on ADE20K (Zhou et al., 2017).
Researcher Affiliation | Collaboration | Xu Ma1, Xiyang Dai2, Jianwei Yang2, Bin Xiao2, Yinpeng Chen2, Yun Fu1, Lu Yuan2; 1Northeastern University, 2Microsoft
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | Code and checkpoints are available at https://github.com/ma-xu/EfficientMod.
Open Datasets | Yes | We validate our EfficientMod on four tasks: image classification on ImageNet-1K (Deng et al., 2009), object detection and instance segmentation on MS COCO (Lin et al., 2014), and semantic segmentation on ADE20K (Zhou et al., 2017).
Dataset Splits | Yes | We evaluate the classification performance of EfficientMod networks on ImageNet-1K. Our training recipe follows the standard practice in DeiT (Touvron et al., 2021a); details can be found in Appendix Sec. 5.
Hardware Specification | Yes | GPU: We chose the P100 GPU for our latency evaluation... CPU: Some models may operate with unpredictable latency on different types of hardware... We also provide all models' measured latency on the Intel(R) Xeon(R) E5-2680 CPU for a full comparison.
Software Dependencies | No | The paper states 'We implement all networks in PyTorch' but does not specify version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | The detailed training hyperparameters are presented in Table 5 (actually Figure 5 in the paper PDF): batch size 256, optimizer AdamW, weight decay 0.05, learning rate 4e-3, epochs 300, warmup epochs 5, hflip 0.5, color jitter 0.4, Mixup 0.8, CutMix 1.0, label smoothing 0.1, LayerScale 1e-4, drop path {0., 0., 0.02}.
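For reference, the reported Experiment Setup hyperparameters can be collected into a DeiT/timm-style training config. This is a minimal sketch: the dictionary key names are illustrative assumptions (the paper does not publish a config file in this form), and only the values come from the reported Table 5.

```python
# Hedged sketch of the paper's reported training hyperparameters as a
# DeiT/timm-style config dict. Key names are assumptions; values are from
# the paper's Table 5 (Figure 5 in the PDF).
train_config = {
    "batch_size": 256,
    "opt": "adamw",                 # AdamW optimizer
    "weight_decay": 0.05,
    "lr": 4e-3,
    "epochs": 300,
    "warmup_epochs": 5,
    "hflip": 0.5,                   # horizontal-flip probability
    "color_jitter": 0.4,
    "mixup": 0.8,
    "cutmix": 1.0,
    "smoothing": 0.1,               # label smoothing
    "layer_scale_init": 1e-4,
    "drop_path": (0.0, 0.0, 0.02),  # one rate per model variant, as listed
}

# Sanity check: effective regularization knobs are all set.
assert train_config["lr"] == 4e-3 and train_config["epochs"] == 300
```

Note that drop path is reported as a set of three values, which suggests one stochastic-depth rate per model size; the mapping of rates to specific EfficientMod variants is not stated in this table.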