Towards Meta-Pruning via Optimal Transport
Authors: Alexander Theus, Olin Geimer, Friedrich Wicke, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark our results for various networks on commonly used datasets such as CIFAR-10, CIFAR-100, and ImageNet. Here, we seek to illustrate the accuracy gains Intra-Fusion can achieve. ... we compare the test accuracy of a VGG11-BN, ResNet18, ResNet50, on CIFAR-10, CIFAR-100, and ImageNet. |
| Researcher Affiliation | Academia | Department of Computer Science, ETH Zurich, Switzerland |
| Pseudocode | Yes | we show both meta-pruning approaches side by side (see Algorithm 1 and 2) to highlight the differences between the two. |
| Open Source Code | Yes | Our code is available here (GitHub repository: https://github.com/alexandertheus/Intra-Fusion). |
| Open Datasets | Yes | We benchmark our results for various networks on commonly used datasets such as CIFAR-10, CIFAR-100, and ImageNet. |
| Dataset Splits | Yes | In the approach we want to propose, we split the dataset into two subsets a and b on which we then train two individual models model_a, model_b in parallel. Specifically, see Performance Comparison: After Convergence (Appendix D.2) and Performance Comparison: Varying Fine-Tuning (Appendix D.3). An extension to combining more than the presented two datasets can be found in k-Fold Split-Data (Appendix D.4). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as exact GPU/CPU models or processor types. Figure 22 only vaguely mentions "Nvidia GPU" without further specification. |
| Software Dependencies | No | The paper does not list specific version numbers for software dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | During the training of the used VGG11-BN and ResNet18 networks, we deploy the training hyperparameters in Table 1. For the fine-tuning of models after pruning we use the hyperparameters in Table 2. |
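The Dataset Splits row refers to the paper's split-data setup, in which the training set is partitioned into two disjoint subsets and two models are trained in parallel before fusion. A minimal PyTorch sketch of that setup is shown below; it is not the authors' released code, and the 50/50 split, the torchvision VGG11-BN, and the SGD hyperparameters are assumptions made purely for illustration.

```python
# Minimal sketch (assumptions: 50/50 split, torchvision VGG11-BN, plain SGD)
# of training two models on disjoint halves of CIFAR-10, as described in the
# split-data setup quoted above. Not the authors' implementation.
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261)),
])
full_train = datasets.CIFAR10("data", train=True, download=True, transform=transform)

# Split the training set into two disjoint subsets a and b (assumed 25k/25k).
subset_a, subset_b = random_split(
    full_train, [25_000, 25_000], generator=torch.Generator().manual_seed(0)
)

def train_one(subset, epochs=1,
              device="cuda" if torch.cuda.is_available() else "cpu"):
    """Train a fresh VGG11-BN on one subset and return the trained model."""
    model = models.vgg11_bn(num_classes=10).to(device)
    loader = DataLoader(subset, batch_size=128, shuffle=True, num_workers=2)
    opt = torch.optim.SGD(model.parameters(), lr=0.05,
                          momentum=0.9, weight_decay=5e-4)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Two individual models trained on the two subsets (sequentially here for
# simplicity); these are the inputs that would subsequently be fused/pruned.
model_a = train_one(subset_a)
model_b = train_one(subset_b)
```

The actual optimal-transport-based fusion and pruning step (Intra-Fusion) is implemented in the linked repository, https://github.com/alexandertheus/Intra-Fusion, and is not reproduced in this sketch.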