Towards Meta-Pruning via Optimal Transport

Authors: Alexander Theus, Olin Geimer, Friedrich Wicke, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark our results for various networks on commonly used datasets such as CIFAR-10, CIFAR-100, and ImageNet. Here, we seek to illustrate the accuracy gains Intra-Fusion can achieve. ... we compare the test accuracy of a VGG11-BN, ResNet18, ResNet50, on CIFAR-10, CIFAR-100, and ImageNet.
Researcher Affiliation | Academia | Department of Computer Science, ETH Zurich, Switzerland
Pseudocode | Yes | We show both meta-pruning approaches side by side (see Algorithms 1 and 2) to highlight the differences between the two. [An illustrative sketch of the optimal-transport idea follows the table.]
Open Source Code | Yes | Our code is available here (GitHub repository: https://github.com/alexandertheus/Intra-Fusion).
Open Datasets | Yes | We benchmark our results for various networks on commonly used datasets such as CIFAR-10, CIFAR-100, and ImageNet.
Dataset Splits | Yes | In the approach we propose, we split the dataset into two subsets a and b, on which we then train two individual models, model_a and model_b, in parallel. Specifically, see Performance Comparison: After Convergence (Appendix D.2) and Performance Comparison: Varying Fine-Tuning (Appendix D.3). An extension to combining more than the presented two datasets can be found in k-Fold Split-Data (Appendix D.4). [A minimal split-data sketch follows the table.]
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as exact GPU/CPU models or processor types. Figure 22 only vaguely mentions an "Nvidia GPU" without further specification.
Software Dependencies | No | The paper does not list specific version numbers for software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | During the training of the VGG11-BN and ResNet18 networks used, we deploy the training hyperparameters in Table 1. For the fine-tuning of models after pruning, we use the hyperparameters in Table 2.
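
The Pseudocode row refers to the paper's two meta-pruning algorithms, which are not reproduced here. As a rough illustration of the general idea behind the title, the sketch below merges the output neurons of a single layer via an optimal transport plan instead of simply discarding low-importance ones. It uses the POT library; the function name `ot_merge_neurons`, the l1 importance scores, and the uniform target marginal are our own illustrative assumptions, not the paper's Algorithm 2 (see the linked repository for the real implementation).

```python
# Minimal sketch of optimal-transport-based neuron merging for structured
# pruning. NOT the authors' algorithm; it only illustrates the mechanics:
# instead of discarding low-importance neurons, their weights are
# transported onto the kept neurons and averaged.
import numpy as np
import ot  # Python Optimal Transport (POT), `pip install pot`

def ot_merge_neurons(W, importance, keep_ratio=0.5):
    """Merge the rows (output neurons) of a weight matrix W down to
    `keep_ratio` of their original number via an optimal transport plan.

    W:          (n_out, n_in) weight matrix of one layer
    importance: (n_out,) nonnegative importance score per output neuron
    """
    n_out = W.shape[0]
    n_keep = max(1, int(round(keep_ratio * n_out)))
    kept = np.argsort(importance)[-n_keep:]  # indices of target neurons

    # Source marginal: all neurons, with mass proportional to importance.
    a = importance / importance.sum()
    # Target marginal: uniform mass over the kept neurons.
    b = np.full(n_keep, 1.0 / n_keep)

    # Ground cost: squared Euclidean distance between weight vectors.
    M = ot.dist(W, W[kept], metric="sqeuclidean")

    # Exact OT plan T of shape (n_out, n_keep).
    T = ot.emd(a, b, M)

    # Barycentric projection: each kept neuron becomes a weighted average
    # of the source neurons transported onto it.
    W_merged = (T.T @ W) / T.sum(axis=0, keepdims=True).T
    return W_merged, kept

# Example: merge a random 64-neuron layer down to 32 neurons.
W = np.random.randn(64, 128)
imp = np.abs(W).sum(axis=1)  # l1 importance of each output neuron
W_small, kept = ot_merge_neurons(W, imp, keep_ratio=0.5)
```

In a full network, the input dimension of the following layer would have to be reduced consistently with `kept`, and the fused model fine-tuned, e.g. with the hyperparameters referenced in the Experiment Setup row.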
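
The Dataset Splits row quotes a setup in which the training data is split into two disjoint subsets and two models are trained in parallel. Below is a minimal PyTorch sketch of that setup; the model choice (VGG11-BN on CIFAR-10), the 50/50 split, and the training hyperparameters are placeholder assumptions, not the paper's configuration from Tables 1 and 2.

```python
# Minimal sketch of split-data training: split the training set into two
# disjoint halves a and b and train one independent model on each.
import torch
from torch.utils.data import random_split, DataLoader
from torchvision import datasets, transforms
from torchvision.models import vgg11_bn

train_set = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Disjoint 50/50 split with a fixed seed for reproducibility.
half = len(train_set) // 2
subset_a, subset_b = random_split(
    train_set, [half, len(train_set) - half],
    generator=torch.Generator().manual_seed(0),
)

def train(subset, epochs=1):
    """Train an independent model on one subset (placeholder loop)."""
    model = vgg11_bn(num_classes=10)
    opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
    loader = DataLoader(subset, batch_size=128, shuffle=True)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

model_a, model_b = train(subset_a), train(subset_b)
# The two models would then be fused (and optionally pruned) as in the paper.
```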