Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Stochastic Layer-Wise Shuffle for Improving Vision Mamba Training

Authors: Zizheng Huang, Haoxing Chen, Jiaqi Li, Jun Lan, Huijia Zhu, Weiqiang Wang, Limin Wang

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted extensive experiments to evaluate Vim training, exploring non-hierarchical models trained via supervised classification and pre-training paradigms, assessing their downstream task performance, and performing detailed algorithm analysis through ablation studies.
Researcher Affiliation | Collaboration | ¹State Key Lab of Novel Software Technology, Nanjing University; ²Shanghai Innovation Institute; ³Independent Researcher; ⁴China Mobile Research Institute; ⁵Shanghai AI Lab. Correspondence to: Limin Wang <EMAIL>.
Pseudocode | Yes | Algorithm 1: Layer-Wise Shuffle forward.
Open Source Code | Yes | Code and models are available at the open-source URL.
Open Datasets | Yes | For supervised training, we train from scratch on ImageNet-1K (Deng et al., 2009), which contains 1.28 million samples for the classification task. We conduct semantic segmentation experiments on ADE20K, and detection and instance segmentation on the COCO 2017 benchmark.
Dataset Splits | Yes | For supervised training, we train from scratch on ImageNet-1K (Deng et al., 2009), which contains 1.28 million samples for the classification task. For the segmentation experiment, we adopt the UPerNet (Xiao et al., 2018) head on ImageNet-1K-trained models. For downstream object detection and instance segmentation tasks, we follow previous work to evaluate our method. The Mask R-CNN (He et al., 2017) structure is adopted with a 1× schedule for 12-epoch fine-tuning.
Hardware Specification | No | No specific hardware details (such as GPU/CPU models) are mentioned for running the experiments. The paper only discusses computational overhead, with throughput measurements at various resolutions.
Software Dependencies | No | The paper mentions the AdamW optimizer and PyTorch pseudo-code, but does not provide specific version numbers for these or any other software components.
Experiment Setup | Yes | Table A.1: supervised training implementation settings. Table A.2: pre-training implementation settings.
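The paper's Algorithm 1 is not reproduced here. Purely as a hedged illustration of the general idea named in the title, the following self-contained Python sketch shows one way a stochastic layer-wise shuffle forward pass could work: layers are selected for shuffling with a probability that grows with depth, and the selected layers are permuted among themselves. The function name, the `max_prob` parameter, and the linear depth schedule are assumptions for illustration, not details taken from the paper.

```python
import random

def layerwise_shuffle_forward(x, layers, max_prob=0.5, seed=None):
    """Hypothetical sketch of a stochastic layer-wise shuffle forward pass.

    Layer i is picked for shuffling with probability max_prob * (i + 1) / n,
    so deeper layers are more likely to move (this linear schedule is an
    assumption, not the paper's exact rule). Picked layers are permuted
    among themselves; all other layers keep their original positions.
    """
    rng = random.Random(seed)
    n = len(layers)
    # Depth-dependent selection of layers to shuffle.
    picked = [i for i in range(n) if rng.random() < max_prob * (i + 1) / n]
    permuted = picked[:]
    rng.shuffle(permuted)
    # Build the execution order: picked positions receive permuted layers.
    order = list(range(n))
    for pos, layer_idx in zip(picked, permuted):
        order[pos] = layer_idx
    # Run the layers in the (possibly shuffled) order.
    for i in order:
        x = layers[i](x)
    return x
```

At inference time the shuffle would be disabled (e.g. `max_prob=0`), recovering the original layer order, so the regularization only perturbs training.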