Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Data-Efficient Structured Pruning via Submodular Optimization
Authors: Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that our method outperforms state-of-the-art methods in the limited-data regime. |
| Researcher Affiliation | Collaboration | Marwa El Halabi (Samsung SAIT AI Lab, Montreal); Suraj Srinivas (Harvard University); Simon Lacoste-Julien (Mila, Université de Montreal; Samsung SAIT AI Lab, Montreal) |
| Pseudocode | Yes | Algorithm 1 GREEDY |
| Open Source Code | Yes | The code for reproducing all experiments is available at https://github.com/marwash25/subpruning. |
| Open Datasets | Yes | LeNet model [LeCun et al., 1989] on the MNIST dataset [Lecun et al., 1998], and on the ResNet56 [He et al., 2016] and the VGG11 [Simonyan and Zisserman, 2015] models on the CIFAR-10 dataset [Krizhevsky et al., 2009]. |
| Dataset Splits | Yes | We report top-1 accuracy results evaluated on the validation set, as we vary the compression ratio (original size / pruned size). Unless otherwise specified, we use the per-layer budget selection method described in Section 5.2 for all the layerwise pruning methods... We set aside a subset of the training set to use as a verification set. |
| Hardware Specification | Yes | All experiments are run on an internal cluster with NVIDIA V100 or A100 GPUs. |
| Software Dependencies | No | The code is written in PyTorch [Paszke et al., 2017]. No specific version number for PyTorch or other software dependencies is provided. |
| Experiment Setup | Yes | To compute the gradients and activations used for pruning in LAYERSAMPLING, ACTGRAD, LAYERACTGRAD, and our method's variants, we use four batches of 128 training images, i.e., n = 512, which corresponds to 1% of the training data in MNIST and CIFAR10. All models are trained for 100 epochs using SGD with momentum 0.9, weight decay 5e-4, and initial learning rate 0.1, which is reduced by a factor of 10 at epochs 50 and 75. |
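
The training schedule and pruning sample size quoted in the Experiment Setup row can be written out as a small plain-Python sketch. This is not the authors' code. The constant and function names are hypothetical, and treating epochs as 0-indexed (so the first drop takes effect at index 50) is an assumption. The training-set sizes of 60,000 (MNIST) and 50,000 (CIFAR-10) are the standard splits and do not come from the table. In PyTorch, a step schedule of this shape is typically expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 75], gamma=0.1)`.

```python
# Hyperparameters quoted in the Experiment Setup row.
INITIAL_LR = 0.1
MILESTONES = (50, 75)    # epochs at which the learning rate is divided by 10
DECAY_FACTOR = 0.1
TOTAL_EPOCHS = 100
MOMENTUM = 0.9           # listed for completeness; not used in this sketch
WEIGHT_DECAY = 5e-4      # listed for completeness; not used in this sketch

# Standard training-set sizes (assumption: not stated in the table).
TRAIN_SIZES = {"MNIST": 60_000, "CIFAR-10": 50_000}


def learning_rate(epoch: int) -> float:
    """Step-decayed learning rate for a 0-indexed epoch (indexing is assumed)."""
    drops = sum(1 for milestone in MILESTONES if epoch >= milestone)
    return INITIAL_LR * DECAY_FACTOR ** drops


def pruning_sample_size(num_batches: int = 4, batch_size: int = 128) -> int:
    """Number of training images used to compute gradients and activations."""
    return num_batches * batch_size


# Learning rate for every epoch of a 100-epoch run.
schedule = [learning_rate(epoch) for epoch in range(TOTAL_EPOCHS)]

# Pruning sample size and its share of each training set.
n = pruning_sample_size()
fractions = {name: n / size for name, size in TRAIN_SIZES.items()}

if __name__ == "__main__":
    for name, frac in fractions.items():
        print(f"{name}: n = {n} is {frac:.2%} of the training set")
```

Under these assumed set sizes, n = 512 is slightly below 1% of MNIST and slightly above 1% of CIFAR-10, so the 1% figure in the table is best read as approximate.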