Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Pruning-Robust Mamba with Asymmetric Multi-Scale Scanning Paths

Authors: Jindi Lv, Yuhao Zhou, Mingjia Shi, Zhiyuan Liang, Panpan Zhang, Xiaojiang Peng, Wangbo Zhao, Zheng Zhu, Jiancheng Lv, Qing Ye, Kai Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Section 4 (Experiment): 4.1 Datasets and Settings; 4.2 Pruning Robustness Analysis; 4.3 Image Classification; 4.4 Semantic Segmentation; 4.5 Ablation Study. Empirical results demonstrate that AMVim achieves state-of-the-art pruning robustness. During token reduction, AMVim-T achieves a substantial 34% improvement in training-free accuracy with identical model sizes and FLOPs. Meanwhile, AMVim-S exhibits only a 1.5% accuracy drop, performing comparably to ViT.
Researcher Affiliation	Collaboration	(1) Sichuan University; (2) University of Virginia; (3) National University of Singapore; (4) Giga AI
Pseudocode	No	The paper describes methods and processes through textual explanations and figures (e.g., Figure 3 illustrates the WMS3 block), but it does not contain a clearly labeled pseudocode or algorithm block.
Open Source Code	Yes	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: All data sources used in this study are clearly cited in the paper, and the code will be uploaded in a zipped format.
Open Datasets	Yes	We evaluate AMVim on the ImageNet-1K dataset, which includes 1,000 object classes, 1.28 million training images, and 50,000 validation images. Images are augmented and resized to 224 × 224 for evaluation. This study focuses on the ImageNet-1K classification task, and we report top-1 validation accuracy. We evaluate AMVim on the downstream semantic segmentation task using the ADE20K dataset [42], with results summarized in Table 4.
Dataset Splits	Yes	We evaluate AMVim on the ImageNet-1K dataset, which includes 1,000 object classes, 1.28 million training images, and 50,000 validation images.
Hardware Specification	Yes	All experiments are conducted with 4 NVIDIA L40S GPUs.
Software Dependencies	No	The paper mentions "AdamW optimization" for the optimizer but does not specify version numbers for other key software components like Python, PyTorch, or CUDA libraries, which are essential for full reproducibility.
Experiment Setup	Yes	AMVim is fine-tuned for 150 epochs with AdamW optimization, initialized using the publicly available weights of Vim [6]. A batch size of 128 is used with two-step gradient accumulation, resulting in an effective total batch size of 1,024. Additional training details are listed in Table 9 in Appendix. Table 9: Training settings for AMVim on ImageNet-1K (fine-tuning config, values given as AMVim-T / AMVim-S): optimizer: AdamW / AdamW; base learning rate: 4e-5 / 1e-5; minimal learning rate: 1e-5 / 5e-6; weight decay: 1e-8 / 0.05; optimizer momentum: β1, β2 = 0.9, 0.999 / β1, β2 = 0.9, 0.999; batch size: 1024 / 1024; training epochs: 150 / 150; learning rate schedule: cosine decay / cosine decay; warmup epochs: 5 / 5; warmup learning rate: 1e-5 / 1e-5; warmup schedule: linear / linear; drop path: 0 / 0.3; mixup: 0.8 / 0.8; cutmix: 1 / 1; EMA: None / None.
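
Note on the Experiment Setup row: the quoted figures can be sanity-checked with a short sketch. It assumes, and this is not stated in the quoted text, that the batch size of 128 is per GPU, so that 128 samples on each of the 4 GPUs over 2 accumulation steps gives 1,024. It also assumes a generic per-epoch linear warmup followed by cosine decay; the function names are hypothetical and the authors' actual scheduler implementation may differ in detail.

```python
import math


def effective_batch_size(per_gpu_batch: int, num_gpus: int, accum_steps: int) -> int:
    """Samples contributing to one optimizer step (assumes 128 is per GPU)."""
    return per_gpu_batch * num_gpus * accum_steps


def lr_at_epoch(epoch: int, base_lr: float, min_lr: float, warmup_lr: float,
                warmup_epochs: int, total_epochs: int) -> float:
    """Generic linear warmup then cosine decay, evaluated once per epoch.

    This is an illustrative schedule, not the authors' code.
    """
    if epoch < warmup_epochs:
        # Linear ramp from warmup_lr up to base_lr.
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # 4 GPUs (Hardware Specification row), two-step gradient accumulation.
    print(effective_batch_size(per_gpu_batch=128, num_gpus=4, accum_steps=2))
    # AMVim-T values from Table 9.
    for epoch in (0, 5, 75, 150):
        print(epoch, lr_at_epoch(epoch, base_lr=4e-5, min_lr=1e-5,
                                 warmup_lr=1e-5, warmup_epochs=5, total_epochs=150))
```

With the AMVim-S values from Table 9, the base and warmup learning rates are both 1e-5, so under this sketch the warmup phase is flat and only the cosine decay toward 5e-6 changes the rate.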