Identifying Useful Learnwares for Heterogeneous Label Spaces
Authors: Lan-Zhe Guo, Zhi Zhou, Yu-Feng Li, Zhi-Hua Zhou
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both theoretical and empirical analyses show that our proposal can quickly and accurately find useful learnwares that satisfy users' requirements. Moreover, experimental results on more than 20 tasks show that our proposal not only identifies the most useful model for the user's task but also produces a similarity ranking that closely tracks the ranking of ground-truth model-reuse performance. |
| Researcher Affiliation | Academia | 1National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China. Correspondence to: Yu-Feng Li <liyf@lamda.nju.edu.cn>. |
| Pseudocode | Yes | Algorithm 1 (Specification Assignment). Input: dataset {X, Y}, model f, feature extractor G(·). Output: model specification S. Algorithm 2 (Identify Useful Learnwares). Input: user's dataset {X_T, Y_T}, M trained models {f_m}_{m=1}^M, specifications {S_m}_{m=1}^M. Output: identified model f. (A Python sketch of these skeletons follows the table.) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We adopt the NICO (He et al., 2021) and DomainNet (Peng et al., 2019) datasets to evaluate the proposed learnware paradigm. |
| Dataset Splits | No | The paper states, "For the user's tasks, we assume there are only 10 labeled examples available for each class. We adopt the small labeled dataset to help generate specifications and fine-tune the selected trained model, leaving the others as the test data." This describes a few-shot labeled/test split, but it does not define a separate validation set for hyperparameter tuning or early stopping, i.e., there is no standard train/validation/test split. (A minimal sketch of the 10-per-class split follows the table.) |
| Hardware Specification | Yes | In our experiments, we estimate the training time for fine-tuning a ResNet-18 on a single NVIDIA 3090 GPU card to be close to 48s. |
| Software Dependencies | No | The paper mentions specific models (ResNet-18, DenseNet-201) and optimizers (SGD) but does not provide version numbers for software libraries, programming languages, or other dependencies (e.g., Python, PyTorch/TensorFlow, or CUDA versions). |
| Experiment Setup | Yes | We train the model for 500 epochs using the SGD optimizer. The initial learning rate is 0.1, and we adopt the cosine annealing learning rate decay strategy. The feature extractor G(·) in our experiments is a DenseNet-201 (Huang et al., 2017) pre-trained on the ImageNet dataset. To generate the model specification, we train a linear model for 200 epochs using the SGD optimizer, and the learning rate is 0.1. (A PyTorch sketch of this setup follows the table.) |
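
The pseudocode row quotes only the input/output signatures of Algorithms 1 and 2. Below is a minimal Python sketch of those skeletons. The internals are assumptions for illustration: class-mean features as the specification and class-wise cosine similarity as the matching score are hypothetical stand-ins, not the paper's actual specification construction or similarity measure.

```python
import numpy as np

def assign_specification(X, Y, f, G):
    """Algorithm 1 skeleton (Specification Assignment).

    Mirrors the paper's signature: dataset {X, Y}, model f, feature
    extractor G(·). The body is a hypothetical stand-in that summarizes
    each class by the mean of its extracted features; f is unused here.
    """
    feats = G(X)  # features for all examples, shape (n, d)
    return {c: feats[Y == c].mean(axis=0) for c in np.unique(Y)}

def identify_useful_learnware(X_T, Y_T, models, specs, G):
    """Algorithm 2 skeleton (Identify Useful Learnwares).

    Scores each candidate model by comparing its specification with a
    specification built from the user's small labeled dataset, then
    returns the best-scoring model. Cosine similarity is an
    illustrative choice only.
    """
    user_spec = assign_specification(X_T, Y_T, None, G)

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

    def score(spec):
        # average best-match similarity over the user's classes,
        # allowing heterogeneous label spaces on the model side
        return np.mean([max(cosine(u, v) for v in spec.values())
                        for u in user_spec.values()])

    best = max(range(len(models)), key=lambda m: score(specs[m]))
    return models[best]
```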
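
The dataset-splits row describes the protocol only in prose: 10 labeled examples per class for specification generation and fine-tuning, the rest as test data, and no validation set. A minimal sketch of that split, assuming NumPy arrays (the function name and seeding are my own choices):

```python
import numpy as np

def few_shot_split(X, Y, shots_per_class=10, seed=0):
    """10 labeled examples per class for specification generation and
    fine-tuning; everything else becomes test data. No validation set
    is carved out, matching the protocol quoted above."""
    rng = np.random.default_rng(seed)
    labeled_idx = []
    for c in np.unique(Y):
        class_idx = np.flatnonzero(Y == c)
        labeled_idx.extend(rng.choice(class_idx, size=shots_per_class,
                                      replace=False))
    labeled_idx = np.asarray(labeled_idx)
    test_idx = np.setdiff1d(np.arange(len(Y)), labeled_idx)
    return (X[labeled_idx], Y[labeled_idx]), (X[test_idx], Y[test_idx])
```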
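
The experiment-setup row fixes the optimizer, learning rate, schedule, and epoch counts but not the framework. Below is a PyTorch sketch consistent with those quoted hyperparameters; the batch size, momentum, loss, class count, and use of torchvision pre-trained weights are assumptions:

```python
import torch
import torch.nn as nn
from torchvision import models

# Trained model: SGD, initial lr 0.1, cosine annealing, 500 epochs
# (quoted from the paper). Momentum and the ResNet-18 backbone are
# assumptions; the class count is task-specific.
model = models.resnet18(num_classes=10)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=500)
loss_fn = nn.CrossEntropyLoss()

# Feature extractor G(·): DenseNet-201 pre-trained on ImageNet, with the
# classifier head dropped and all weights frozen.
densenet = models.densenet201(weights=models.DenseNet201_Weights.IMAGENET1K_V1)
G = nn.Sequential(densenet.features, nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten())
for p in G.parameters():
    p.requires_grad = False

# Linear model for specification generation: 200 epochs of SGD at lr 0.1.
linear = nn.Linear(1920, 10)  # 1920 = DenseNet-201 feature dimension
linear_opt = torch.optim.SGD(linear.parameters(), lr=0.1)

for epoch in range(500):
    for x, y in []:  # replace [] with a real DataLoader
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    sched.step()
```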