Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Authors: Max Zimmer, Christoph Spiegel, Sebastian Pokutta
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 3, we experimentally validate our findings across image classification, semantic segmentation, and neural machine translation architectures and datasets. |
| Researcher Affiliation | Academia | Max Zimmer1, Christoph Spiegel1 & Sebastian Pokutta1,2 1Department for AI in Society, Science, and Technology, Zuse Institute Berlin, Germany 2Institute of Mathematics, Technische Universität Berlin, Germany {zimmer,spiegel,pokutta}@zib.de |
| Pseudocode | Yes | Figure 2: Left: Sketch of the algorithm for a single phase and m = 3. Right: Pseudocode for SMS. |
| Open Source Code | Yes | For reproducibility, our implementation is available at github.com/ZIB-IOL/SMS. |
| Open Datasets | Yes | We evaluate our approach on well-known datasets for image recognition, semantic segmentation, and neural machine translation (NMT), including ImageNet-1K (Russakovsky et al., 2015), CIFAR-10/100 (Krizhevsky et al., 2009), CelebA (Liu et al., 2015), Cityscapes (Cordts et al., 2016), WMT16 DE-EN (Bojar et al., 2016) |
| Dataset Splits | Yes | For validation, we use 10% of the training data. |
| Hardware Specification | No | The paper mentions 'FLOPs are computed using a single test batch' but does not specify any particular hardware used for running the experiments (e.g., specific GPU or CPU models, memory details). |
| Software Dependencies | No | The paper mentions using SGD as an optimizer and adapting code from the ShrinkBench framework, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Table 3 shows the exact pretraining settings for each dataset-architecture pair, reporting the number of epochs used for pretraining, the batch size, weight decay as well as the learning rate used. The exact retraining hyperparameters are specified explicitly in the descriptions of each experiment or in the corresponding subsection in Appendix B. |
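The Pseudocode row above refers to SMS, whose core step is averaging several retrained copies of the same pruned model, all of which share one sparsity mask. As a rough illustrative sketch only, not the authors' implementation (see github.com/ZIB-IOL/SMS for that), uniform averaging of PyTorch state_dicts could look like the snippet below; the function name, the plain state_dict interface, and the example file paths are assumptions made for illustration.

```python
import copy
import torch


def average_checkpoints(state_dicts):
    """Uniformly average state_dicts of models retrained from the same pruned
    checkpoint. Because all candidates share one sparsity mask, weights that
    are zero in every candidate stay zero after averaging, so the averaged
    model keeps the same sparsity level."""
    avg = copy.deepcopy(state_dicts[0])
    for key, value in avg.items():
        if isinstance(value, torch.Tensor) and value.is_floating_point():
            avg[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
    return avg


# Hypothetical usage: load m retrained candidates and average them into a "soup".
# candidate_paths = ["candidate_0.pt", "candidate_1.pt", "candidate_2.pt"]
# soup = average_checkpoints([torch.load(p, map_location="cpu") for p in candidate_paths])
# model.load_state_dict(soup)
```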