Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Cross-Layer Retrospective Retrieving via Layer Attention
Authors: Yanwen Fang, Yuxi CAI, Jintai Chen, Jingyu Zhao, Guangjian Tian, Guodong Li
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Its effectiveness has been extensively evaluated in image classification, object detection and instance segmentation tasks, where improvements can be consistently observed. |
| Researcher Affiliation | Collaboration | 1 Department of Statistics & Actuarial Science, The University of Hong Kong; 2 College of Computer Science and Technology, Zhejiang University; 3 Huawei Noah's Ark Lab |
| Pseudocode | Yes | Pseudo codes of MRLA-base's and MRLA-light's implementations in CNNs and vision transformers are given below. |
| Open Source Code | Yes | Our code is available at https://github.com/joyfang1106/MRLA. |
| Open Datasets | Yes | We use the middle-sized ImageNet-1K dataset (Deng et al., 2009) directly. |
| Dataset Splits | Yes | Table 1: Comparisons of single-crop accuracy on the ImageNet-1K validation set. |
| Hardware Specification | Yes | All models are implemented by PyTorch toolkit on 4 V100 GPUs. |
| Software Dependencies | No | The paper mentions software like PyTorch, MMDetection, and timm, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | Specifically, the input images are randomly cropped to 224 × 224 with random horizontal flipping. The networks are trained from scratch using SGD with momentum of 0.9, weight decay of 1e-4, and a mini-batch size of 256. The models are trained within 100 epochs by setting the initial learning rate to 0.1, which is decreased by a factor of 10 per 30 epochs. |
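
The learning-rate schedule quoted in the Experiment Setup row can be made concrete with a small sketch. It is not the authors' code. The names `TRAIN_CONFIG` and `step_lr` are hypothetical, and 0-based epoch indexing is an assumption, since the quoted text does not say exactly where each drop takes effect.

```python
# Hyperparameters as quoted in the Experiment Setup row.
# The dictionary layout and key names are illustrative assumptions.
TRAIN_CONFIG = {
    "crop_size": 224,        # random crop to 224 x 224, with horizontal flipping
    "optimizer": "SGD",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "batch_size": 256,
    "epochs": 100,
    "base_lr": 0.1,
    "lr_step_epochs": 30,    # learning rate drops every 30 epochs
    "lr_decay_factor": 0.1,  # "decreased by a factor of 10"
}


def step_lr(epoch, config=TRAIN_CONFIG):
    """Return the learning rate for a 0-based epoch under step decay."""
    num_drops = epoch // config["lr_step_epochs"]
    return config["base_lr"] * config["lr_decay_factor"] ** num_drops


if __name__ == "__main__":
    # Print the rate at the first epoch of each stage.
    for epoch in range(0, TRAIN_CONFIG["epochs"], TRAIN_CONFIG["lr_step_epochs"]):
        print(f"epoch {epoch:3d}: lr = {step_lr(epoch):.4g}")
```

Under these assumptions, training passes through four learning-rate stages within the 100 epochs, with the last stage covering only epochs 90 to 99.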