Strengthening Layer Interaction via Dynamic Layer Attention
Authors: Kaishen Wang, Xun Xia, Jian Liu, Zhang Yi, Tao He
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the effectiveness of the proposed DLA architecture, which outperforms other state-of-the-art methods in image recognition and object detection tasks. |
| Researcher Affiliation | Academia | Kaishen Wang 1, Xun Xia 2, Jian Liu 2, Zhang Yi 1, Tao He 1; 1 College of Computer Science, Sichuan University; 2 Clinical Medical College and The First Affiliated Hospital of Chengdu Medical College. wangks@stu.scu.edu.cn, xiaxun@cmc.edu.cn, liujiansh@126.com, {zhangyi, tao he}@scu.edu.cn |
| Pseudocode | No | The paper describes the workflow of the DSU block and provides architectural diagrams (Figure 3), but it does not include formal pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/tunantu/Dynamic-Layer-Attention. |
| Open Datasets | Yes | We conducted experiments on the CIFAR-10, CIFAR-100, and ImageNet-1K datasets using ResNets [He et al., 2016] as the backbone network for image classification. In the context of object detection, our approach was evaluated on the COCO 2017 dataset using the Faster R-CNN [Ren et al., 2015] and Mask R-CNN [He et al., 2017] frameworks as detectors. |
| Dataset Splits | Yes | For the CIFAR-10 and CIFAR-100 datasets, we employed standard data augmentation strategies [Huang et al., 2016]. The training process involved random horizontal flipping of images, padding each side by 4 pixels, and then randomly cropping to 32 x 32. For the ImageNet-1K dataset, we adopted the same data augmentation strategy and hyperparameter settings outlined in [He et al., 2016] and [He et al., 2019]. During training, images were randomly cropped to 224 x 224 with horizontal flipping. In the testing phase, images were resized to 256 x 256, then centrally cropped to a final size of 224 x 224. Object detection results are reported on the COCO val2017 split with Faster R-CNN and Mask R-CNN. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or detailed computer specifications used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like 'SGD optimizer', 'Faster R-CNN', 'Mask R-CNN', and 'MMDetection toolkit', but it does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | For the CIFAR-10 and CIFAR-100 datasets, training hyperparameters such as batch size, initial learning rate, and weight decay followed the recommendations of the original ResNets [He et al., 2016]. For the ImageNet-1K dataset, the optimization process utilized an SGD optimizer with a momentum of 0.9 and weight decay of 1e-4. The initial learning rate was set to 0.1 and decreased according to the MultiStepLR schedule over 100 epochs with a batch size of 256. For object detection, the optimization process employed SGD with a weight decay of 1e-4, momentum of 0.9, and a batch size of 8. The models underwent training for a total of 12 epochs, starting with an initial learning rate of 0.01. Learning rate adjustments occurred at the 8th and 11th epochs, with a reduction by a factor of 10 each time. |
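The detection schedule quoted above (12 epochs, initial LR 0.01, divided by 10 at the 8th and 11th epochs) is a standard MultiStepLR decay, and can be sketched in plain Python. Note that whether the milestones are 0- or 1-indexed is an assumption here; the paper excerpt does not say, and the ImageNet milestones are not given at all, so only the detection schedule is shown.

```python
def multistep_lr(base_lr, milestones, gamma, epoch):
    """LR after MultiStep decay: multiplied by gamma once per milestone reached.

    Mirrors the behavior of PyTorch's torch.optim.lr_scheduler.MultiStepLR.
    """
    n_decays = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** n_decays

# Detection setup from the paper: 12 epochs, lr 0.01, /10 at epochs 8 and 11
# (milestone indexing assumed 0-based, matching PyTorch's convention).
schedule = [multistep_lr(0.01, milestones=[8, 11], gamma=0.1, epoch=e)
            for e in range(12)]
```

Here `schedule` holds 0.01 for epochs 0-7, 0.001 for epochs 8-10, and 0.0001 for the final epoch.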