Cross-Layer Retrospective Retrieving via Layer Attention
Authors: Yanwen Fang, Yuxi Cai, Jintai Chen, Jingyu Zhao, Guangjian Tian, Guodong Li
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Its effectiveness has been extensively evaluated in image classification, object detection and instance segmentation tasks, where improvements can be consistently observed. |
| Researcher Affiliation | Collaboration | 1. Department of Statistics & Actuarial Science, The University of Hong Kong; 2. College of Computer Science and Technology, Zhejiang University; 3. Huawei Noah's Ark Lab |
| Pseudocode | Yes | Pseudo codes of MRLA-base's and MRLA-light's implementations in CNNs and vision transformers are given below. |
| Open Source Code | Yes | Our code is available at https://github.com/joyfang1106/MRLA. |
| Open Datasets | Yes | We use the middle-sized ImageNet-1K dataset (Deng et al., 2009) directly. |
| Dataset Splits | Yes | Table 1: Comparisons of single-crop accuracy on the ImageNet-1K validation set. |
| Hardware Specification | Yes | All models are implemented by PyTorch toolkit on 4 V100 GPUs. |
| Software Dependencies | No | The paper mentions software like PyTorch, MMDetection, and timm, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Specifically, the input images are randomly cropped to 224 × 224 with random horizontal flipping. The networks are trained from scratch using SGD with momentum of 0.9, weight decay of 1e-4, and a mini-batch size of 256. The models are trained within 100 epochs by setting the initial learning rate to 0.1, which is decreased by a factor of 10 per 30 epochs. |
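The quoted training recipe fixes a step-decay learning-rate schedule: start at 0.1 and divide by 10 every 30 epochs over a 100-epoch budget. A minimal sketch of that schedule as a plain function (the function name and signature are illustrative, not from the paper):

```python
def step_decay_lr(epoch, base_lr=0.1, decay_factor=0.1, step=30):
    """Learning rate under the paper's schedule: base_lr divided by 10
    every `step` epochs (epochs are 0-indexed here)."""
    return base_lr * (decay_factor ** (epoch // step))

# Resulting schedule over the 100-epoch budget:
#   epochs  0-29 -> 0.1
#   epochs 30-59 -> 0.01
#   epochs 60-89 -> 0.001
#   epochs 90-99 -> 0.0001
```

In PyTorch this corresponds to pairing `torch.optim.SGD(params, lr=0.1, momentum=0.9, weight_decay=1e-4)` with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)`.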