PAC-Net: A Model Pruning Approach to Inductive Transfer Learning
Authors: Sanghoon Myung, In Huh, Wonik Jang, Jae Myung Choe, Jisu Ryu, Daesin Kim, Kee-Eung Kim, Changwook Jeong
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across a varied and extensive set of inductive transfer learning experiments, we show that our method achieves state-of-the-art performance by a large margin. |
| Researcher Affiliation | Collaboration | 1 CSE Team, Innovation Center, Samsung Electronics; 2 Kim Jaechul Graduate School of AI, KAIST; 3 Graduate School of Semiconductor Materials and Devices Engineering, UNIST. |
| Pseudocode | Yes | Algorithm 1: Pruning (see the hedged pruning sketch after the table) |
| Open Source Code | No | No explicit statement about the release of source code or a link to a code repository was found in the paper. |
| Open Datasets | Yes | Friedman #1 (Friedman, 1991) is a well-known regression problem that (Pardoe & Stone, 2010) modified for inductive transfer learning, and the CelebFaces Attributes Dataset (CelebA) (Liu et al., 2018b) is a large-scale dataset of celebrity images with forty attribute annotations. |
| Dataset Splits | No | We used half of the images for the training set and the rest for the test set (Appendix A.2, CelebA), and there are 20,000 samples, half of which are training data and the rest test data (Appendix A.1, Regression). The paper specifies train and test splits, but no explicit validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam), architectures (ResNet-18), and methods (Runge-Kutta), but does not provide specific software library names with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | For PAC-Net, we set the pruning ratio to 0.8 and λ to 0.01. For L2-SP, we set the L2 strength of convolution layers to 0.01. (Appendix A.2 Celeb A) and The model is trained for 10,000 epochs with a batch size of 128, where the Adam optimizer is used with a step size of 10⁻⁴. (Appendix A.5 Real-world Problem) A hedged sketch of this setup follows the table. |
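
The pseudocode row refers to the paper's Algorithm 1 (Pruning). As an illustration only, the following is a minimal sketch of magnitude-based weight pruning at the 0.8 ratio reported in the experiment-setup row, assuming a PyTorch-style layer; the paper's exact pruning criterion and its subsequent retraining and calibration steps are not reproduced here, and all names are illustrative.

```python
import torch

def magnitude_prune_mask(weight: torch.Tensor, prune_ratio: float = 0.8) -> torch.Tensor:
    """Binary mask keeping the largest-magnitude weights: 1 = keep, 0 = prune."""
    k = int(prune_ratio * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    # The k-th smallest absolute value serves as the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()

# Example: prune 80% of a layer's weights, matching the ratio reported above.
layer = torch.nn.Linear(64, 64)
mask = magnitude_prune_mask(layer.weight.data, prune_ratio=0.8)
layer.weight.data.mul_(mask)  # pruned weights are zeroed; the mask records
                              # which weights stay fixed in later training.
```

Roughly, such a mask is what separates the weights retained for the source task from those later updated for the target task, though the paper's Algorithm 1 should be consulted for the authoritative procedure.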
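The experiment-setup row can likewise be read as a concrete training configuration. The sketch below assumes a PyTorch-style loop with the reported Adam step size (1e-4), batch size (128), and an L2-SP-style penalty of strength 0.01 that pulls the weights toward a copy of the source weights; the model, data, and loss function are hypothetical placeholders, not the paper's networks.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical model and data standing in for the networks in Appendix A.2/A.5.
model = torch.nn.Linear(10, 1)
source_weights = {n: p.detach().clone() for n, p in model.named_parameters()}

dataset = TensorDataset(torch.randn(10_000, 10), torch.randn(10_000, 1))
loader = DataLoader(dataset, batch_size=128, shuffle=True)   # batch size 128

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)    # step size 1e-4
l2_sp_strength = 0.01                                        # reported L2 strength

for epoch in range(10):  # the paper reports 10,000 epochs; shortened here
    for x, y in loader:
        loss = torch.nn.functional.mse_loss(model(x), y)
        # L2-SP-style penalty pulling weights toward the source solution.
        for name, param in model.named_parameters():
            loss = loss + l2_sp_strength * (param - source_weights[name]).pow(2).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```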