CoAtFormer: Vision Transformer with Composite Attention

Authors: Zhiyong Chang, Mingjun Yin, Yan Wang

IJCAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments show our CoAtFormer achieves state-of-the-art results on various different tasks." |
| Researcher Affiliation | Collaboration | Zhiyong Chang¹, Mingjun Yin², Yan Wang³ (¹Peking University, ²The University of Melbourne, ³Zuoyebang) |
| Pseudocode | No | The paper includes mathematical formulations (Equations 1-19) and architectural diagrams (Figures 2 and 3), but it does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code for the methodology, nor a link to a code repository. |
| Open Datasets | Yes | "We conduct experiments on ImageNet-1K [Deng et al., 2009] classification, COCO [Lin et al., 2014] object detection and instance segmentation, and ADE20K [Zhou et al., 2017] semantic segmentation." |
| Dataset Splits | Yes | "We conduct experiments on ImageNet-1K [Deng et al., 2009] classification, COCO [Lin et al., 2014] object detection and instance segmentation, and ADE20K [Zhou et al., 2017] semantic segmentation. For fair comparison, we follow the same training strategies as previous works [Touvron et al., 2020; Liu et al., 2021]." |
| Hardware Specification | No | The paper reports computational costs (e.g., FLOPs) and parameter counts but does not specify any hardware details such as GPU/CPU models, memory, or cloud computing instances used for the experiments. |
| Software Dependencies | No | The paper mentions software components such as the AdamW optimizer, Mask R-CNN, Cascade Mask R-CNN, UperNet, and the GELU activation, but it does not provide version numbers for any of these dependencies. |
| Experiment Setup | Yes | "For fair comparison, we follow the same training strategies as previous works [Touvron et al., 2020; Liu et al., 2021]. Specifically, we train all our models for 300 epochs with the input size of 224×224. We employ the AdamW optimizer with weight decay of 0.05. The default batch size and initial learning rate are set to 1024 and 0.001." |
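
The Experiment Setup row quotes enough hyperparameters to reconstruct the training configuration. The following is a minimal PyTorch sketch of that recipe, not the authors' code: the stand-in model is a placeholder (CoAtFormer's source is not released), and the cosine learning-rate schedule is an assumption carried over from the DeiT/Swin recipes the paper says it follows, since the quoted excerpt does not name a schedule.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingLR

# Hyperparameters quoted in the Experiment Setup row.
EPOCHS = 300        # "train all our models for 300 epochs"
INPUT_SIZE = 224    # "input size of 224×224"
BATCH_SIZE = 1024   # global batch size; not exercised in this sketch
BASE_LR = 1e-3      # "initial learning rate ... 0.001"
WEIGHT_DECAY = 0.05 # "AdamW optimizer with weight decay of 0.05"

# Stand-in model: a trivial ImageNet-1K classifier used purely to make
# the sketch runnable; it is NOT the CoAtFormer architecture.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * INPUT_SIZE * INPUT_SIZE, 1000),  # 1000 ImageNet-1K classes
)

optimizer = torch.optim.AdamW(
    model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY
)

# Assumption: cosine decay over the full training run, as in the
# DeiT/Swin baselines the paper follows; the excerpt does not name one.
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)
```

A global batch size of 1024 is normally realized with data-parallel training across several accelerators, which is consistent with the paper reporting FLOPs and parameters but no specific hardware.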