Linear Context Transform Block
Authors: Dongsheng Ruan, Jun Wen, Nenggan Zheng, Min Zheng
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that the LCT block outperforms the SE block on the ImageNet image classification task and on COCO object detection/segmentation with different backbone models. Moreover, LCT yields consistent performance gains over existing state-of-the-art detection architectures, e.g., 1.5∼1.7% AP^bbox and 1.0∼1.2% AP^mask improvements on the COCO benchmark, irrespective of different baseline models of varied capacities. |
| Researcher Affiliation | Academia | (1) Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, Zhejiang, China; (2) College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang, China; (3) State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China. {21530003, junwen, zng, minzheng}@zju.edu.cn |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for the Linear Context Transform block or providing a link to its repository. |
| Open Datasets | Yes | ImageNet-1K (Russakovsky et al. 2015) |
| Dataset Splits | Yes | The ImageNet 2012 dataset contains 1.28 million training images and 50K validation images with 1000 classes. For COCO, training uses the 118k train images and evaluation the 5k val images. |
| Hardware Specification | No | The paper mentions training on '4 GPUs' but does not specify the exact GPU models or any other specific hardware components used for the experiments. |
| Software Dependencies | No | All experiments are implemented with the mmdetection framework (Chen et al. 2019). This names a framework but gives no version numbers for key software components such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | ImageNet: We train all models from scratch on 4 GPUs for 100 epochs, using a synchronous SGD optimizer with a weight decay of 0.0001 and momentum of 0.9. The initial learning rate is set to 0.1 and decreases by a factor of 0.1 every 30 epochs. ... For the ResNet-50 backbone, the total batch size is set to 256. For the ResNet-101 backbone, we reduce the batch size to 220... G is set to 64 by default. COCO: We train on 4 GPUs with 1 image each for 12 epochs. All models are trained using synchronized SGD with a weight decay of 1e-4 and momentum of 0.9. According to the linear scaling rule (Goyal et al. 2017), the initial learning rate is set to 0.005, which is decreased by a factor of 10 at the 9th and 12th epochs. Hedged sketches of the LCT block and of this training configuration follow the table. |
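
For reference, the block under study is compact: per the paper's description, it pools a global context vector, normalizes it within G channel groups (G = 64 by default, per the setup above), and gates the input with a per-channel linear transform followed by a sigmoid. Below is a minimal PyTorch sketch of that description; the class and argument names (`LCTBlock`, `groups`, `eps`) are our own illustration, not taken from any released code.

```python
import torch
import torch.nn as nn


class LCTBlock(nn.Module):
    """Sketch of a Linear Context Transform block (Ruan et al., AAAI 2020).

    The globally pooled context is normalized within `groups` channel
    groups, rescaled channel-wise by learnable gamma/beta, and passed
    through a sigmoid to gate the input feature map.
    """

    def __init__(self, channels: int, groups: int = 64, eps: float = 1e-5):
        super().__init__()
        assert channels % groups == 0, "channels must be divisible by groups"
        self.groups = groups
        self.eps = eps
        # One learnable scale and shift per channel (the "linear transform").
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        # Global context via average pooling: (N, C, 1, 1).
        z = x.mean(dim=(2, 3), keepdim=True)
        # Normalize the context vector within each channel group.
        zg = z.view(n, self.groups, c // self.groups)
        mean = zg.mean(dim=2, keepdim=True)
        var = zg.var(dim=2, unbiased=False, keepdim=True)
        zg = (zg - mean) / torch.sqrt(var + self.eps)
        z = zg.view(n, c, 1, 1)
        # Per-channel linear transform, then sigmoid gating of the input.
        gate = torch.sigmoid(self.gamma * z + self.beta)
        return x * gate
```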
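The reported ImageNet schedule maps directly onto a standard SGD plus step-decay configuration, sketched below; `model` is a placeholder, and this is an illustration of the stated hyperparameters, not the authors' code. For the COCO setting, the linear scaling rule also checks out arithmetically: assuming the common mmdetection reference of lr 0.02 for a global batch of 16, a global batch of 4 (4 GPUs × 1 image) gives 0.02 × 4/16 = 0.005, matching the reported initial learning rate.

```python
import torch
import torch.nn as nn

# Placeholder; in the paper this would be a ResNet-50/101 with LCT blocks.
model = nn.Linear(8, 8)

# ImageNet schedule as reported: synchronous SGD, lr 0.1, momentum 0.9,
# weight decay 1e-4, decayed by 0.1x every 30 epochs over 100 epochs.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4
)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    # ... run one epoch over the 256-image global batch here ...
    scheduler.step()
```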