Contextual Convolutional Networks

Authors: Shuxian Liang, Xu Shen, Tongliang Liu, Xian-Sheng Hua

ICLR 2023

Reproducibility Assessment (variable, result, and supporting excerpt)
Research Type: Experimental
  "4 EXPERIMENTS: In the following, Contextual CNN is compared with the state of the arts (SOTAs) on three tasks, i.e., image classification, video classification and instance segmentation. We then ablate important design elements and analyze internal properties of the method via exemplification and visualizations." Section headings: 4.1 Image Classification on ImageNet-1K; 4.2 Empirical Evaluation on Downstream Tasks; 4.3 Ablation Study.
Researcher Affiliation: Collaboration
  Shuxian Liang (1,2), Xu Shen (2), Tongliang Liu (3), Xian-Sheng Hua (1); (1) Zhejiang University, (2) Alibaba Cloud Computing Ltd., (3) Sydney AI Centre, The University of Sydney
Pseudocode: No
  The paper describes the architecture and its components in detail (Sections 3.1, 3.2, 3.3) but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code: Yes
  "Code is available at: https://github.com/liang4sx/contextual_cnn."
Open Datasets: Yes
  "For image classification, we benchmark Contextual CNN on ImageNet-1K (Deng et al., 2009). Kinetics-400 (Kay et al., 2017) is a large-scale video action classification dataset... The instance segmentation experiments are conducted on COCO (Lin et al., 2014)..."
Dataset Splits: Yes
  "For image classification, we benchmark Contextual CNN on ImageNet-1K (Deng et al., 2009). It contains 1.28M training images and 50K validation images from 1,000 classes."
Hardware Specification: Yes
  "Following Liu et al. (2021), inference throughput is measured on a V100 GPU."
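Inference throughput of the kind quoted above is typically measured with a warm-up phase followed by a timed loop. The sketch below is a generic illustration, not the paper's benchmarking code; the `dummy_model` callable, batch size, and iteration counts are hypothetical.

```python
import time

def measure_throughput(model, batch_size, warmup=10, iters=50):
    """Return images/second for a callable `model` that processes one batch per call."""
    for _ in range(warmup):              # warm-up iterations, excluded from timing
        model()
    start = time.perf_counter()
    for _ in range(iters):               # timed iterations
        model()
    elapsed = time.perf_counter() - start
    return iters * batch_size / elapsed

# Hypothetical stand-in for a forward pass on a batch of 64 images.
dummy_model = lambda: sum(i * i for i in range(10_000))
throughput = measure_throughput(dummy_model, batch_size=64)
```

On a real GPU, one would additionally synchronize the device before reading the clock, since kernel launches are asynchronous.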
Software Dependencies: No
  The paper mentions using an AdamW optimizer but does not provide version numbers for any software dependencies or libraries.
Experiment Setup: Yes
  "Following Touvron et al. (2021); Liu et al. (2021; 2022), we train the model for 300 epochs using an AdamW optimizer (Loshchilov & Hutter, 2017) with a learning rate of 0.001. The batch size we use is 4,096 and the weight decay is 0.05. We adopt the same augmentation and regularization strategies as Liu et al. (2022) in training."
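The quoted optimizer settings can be made concrete with a minimal sketch of the decoupled-weight-decay Adam (AdamW) update of Loshchilov & Hutter (2017). Only the learning rate (0.001) and weight decay (0.05) come from the paper; the beta/epsilon values are the common AdamW defaults (an assumption), and the scalar quadratic loss is a toy stand-in for the actual training objective.

```python
import math

# Hyperparameters quoted in the paper's setup.
LR = 1e-3
WEIGHT_DECAY = 0.05
# Common AdamW defaults; assumed, not stated in the paper.
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8

def adamw_step(param, grad, m, v, t, lr=LR, wd=WEIGHT_DECAY):
    """One AdamW update: weight decay is applied to the parameter
    directly, decoupled from the gradient-based Adam step."""
    m = BETA1 * m + (1 - BETA1) * grad          # first-moment estimate
    v = BETA2 * v + (1 - BETA2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - BETA1 ** t)                # bias correction
    v_hat = v / (1 - BETA2 ** t)
    param -= lr * (m_hat / (math.sqrt(v_hat) + EPS) + wd * param)
    return param, m, v

# Toy usage: drive a scalar parameter down the quadratic loss L(w) = w^2.
w, m, v = 2.0, 0.0, 0.0
for t in range(1, 101):
    w, m, v = adamw_step(w, 2 * w, m, v, t)  # gradient of w^2 is 2w
```

The decoupling is the key design point: with plain Adam, L2 regularization would be folded into the gradient and rescaled by the adaptive denominator, whereas here the `wd * param` term shrinks the weights at a rate controlled only by the learning rate.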