Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Elastic ViTs from Pretrained Models without Retraining

Authors: Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort, Cees G. M. Snoek, Yuki Asano

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on DINO, SigLIPv2, DeiT, and AugReg models demonstrate superior performance over state-of-the-art methods across various sparsities, requiring less than five minutes on a single A100 GPU to generate elastic models that can be adjusted to any computational budget. Our key contributions include an efficient pruning strategy for pretrained Vision Transformers, a novel evolutionary approximation of Hessian off-diagonal structures, and a self-supervised importance scoring mechanism that maintains strong performance without requiring retraining or labels.
Researcher Affiliation	Collaboration	Walter Simoncini (1,2,*), Michael Dorkenwald (2,*), Tijmen Blankevoort (3), Cees G.M. Snoek (2), Yuki M. Asano (1); (1) University of Technology Nuremberg, (2) University of Amsterdam, (3) NVIDIA
Pseudocode	Yes	The pseudocode for our algorithm is listed in Appendix D.2. Algorithm 1 outlines our single-shot pruning procedure.
Open Source Code	Yes	Code and pruned models are available at: https://elastic.ashita.nl/. Furthermore, we release the codebase used to run the experiments presented in this paper at https://github.com/WalterSimoncini/SnapViT.
Open Datasets	Yes	We investigate the performance of pruned models on 7 image classification datasets, namely ImageNet-1k [55], FGVC Aircraft [41], Oxford-IIIT Pets [49], DTD Textures [11], EuroSAT [26] and CIFAR-10/100 [33], plus Pascal VOC 2012 [17] for semantic segmentation. Table 4 lists all the datasets used in this paper alongside their license and citation.
Dataset Splits	Yes	We use the train/test splits defined by the dataset authors where possible, except for EuroSAT, for which we use an 80/20 stratified split as indicated by the dataset paper. We always report the performance on the test split, except for ImageNet-1k and Pascal VOC, for which we report performance on the validation split. For the linear classification experiments we use the validation split defined by the dataset authors if available, and otherwise create one using an 80/20 random split.
Hardware Specification	Yes	The pruning experiments were run using an NVIDIA A100 GPU with 40 GB of VRAM, 16 CPU cores, and 40 GB of RAM.
Software Dependencies	No	We evaluate pruned models in k-nearest neighbor classification using the implementation from scikit-learn [50].
Experiment Setup	Yes	We prune models to six target sparsities, namely 10, 20, 30, 40, 50, and 60% in one shot. To do so, we first estimate gradients using either a DINO or a cross-entropy loss and 1000 random samples from the ImageNet-1k training set (unless specified otherwise) and batch size 16. Gradients are averaged over each batch and summed across batches. We do not use any data augmentation for the cross-entropy loss, and for the DINO loss, we only use random cropping to generate 2 global and 10 local crops, with scales between (0.25, 1.0) and (0.05, 0.25), respectively.
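
The gradient accumulation scheme quoted under Experiment Setup (average within each batch, sum across batches, 1000 samples, batch size 16) can be illustrated with a minimal sketch. This is not the authors' code: the toy squared-error gradient, the function names, and the choice to keep the final partial batch of 8 samples are all assumptions made for illustration. The paper uses a DINO or cross-entropy loss on a Vision Transformer instead.

```python
import random


def batches(items, batch_size):
    """Yield consecutive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def toy_gradient(weights, sample):
    """Hypothetical per-sample gradient of 0.5 * (w . x - y)^2 w.r.t. w.

    Stands in for the DINO or cross-entropy gradient used in the paper.
    """
    x, y = sample
    residual = sum(w * xi for w, xi in zip(weights, x)) - y
    return [residual * xi for xi in x]


def accumulate_gradients(weights, samples, batch_size=16):
    """Average gradients within each batch, then sum the batch means."""
    total = [0.0] * len(weights)
    num_batches = 0
    for batch in batches(samples, batch_size):
        grads = [toy_gradient(weights, s) for s in batch]
        # Mean over the samples in this batch, one value per weight.
        batch_mean = [sum(col) / len(batch) for col in zip(*grads)]
        # Sum (not average) the per-batch means across batches.
        total = [t + g for t, g in zip(total, batch_mean)]
        num_batches += 1
    return total, num_batches


# 1000 synthetic samples with 4 features each (assumed toy data).
rng = random.Random(0)
samples = [([rng.gauss(0, 1) for _ in range(4)], rng.gauss(0, 1))
           for _ in range(1000)]
weights = [0.1, -0.2, 0.3, 0.0]

grad_sum, num_batches = accumulate_gradients(weights, samples)
print(num_batches)  # 63: 62 full batches of 16 plus one batch of 8
```

Because the batch means are summed rather than averaged, the magnitude of the accumulated gradient grows with the number of batches. With a partial final batch, those 8 samples also carry more weight per sample than the samples in full batches. Whether the original pipeline drops or keeps that last batch is not stated in the excerpt.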