Identifying Useful Learnwares for Heterogeneous Label Spaces

Authors: Lan-Zhe Guo, Zhi Zhou, Yu-Feng Li, Zhi-Hua Zhou

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Both theoretical and empirical analyses show that our proposal can quickly and accurately find useful learnwares that satisfy users' requirements. Moreover, experimental results on more than 20 tasks show that our proposal not only identifies the most useful model for the user's task, but also that its similarity ranking is closely related to the ranking of ground-truth model reuse performance."
Researcher Affiliation | Academia | "National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China. Correspondence to: Yu-Feng Li <liyf@lamda.nju.edu.cn>."
Pseudocode | Yes | Algorithm 1 (Specification Assignment) — Input: dataset {X, Y}, model f, feature extractor G(·); Output: model specification S. Algorithm 2 (Identify Useful Learnwares) — Input: user's dataset {X_T, Y_T}, M trained models {f_m} (m = 1, ..., M), specifications {S_m} (m = 1, ..., M); Output: identified model f.
Open Source Code | No | The paper provides no statement or link indicating that source code for the methodology is openly available.
Open Datasets | Yes | "We adopt the NICO (He et al., 2021) and DomainNet (Peng et al., 2019) datasets to evaluate the proposed learnware paradigm."
Dataset Splits | No | The paper states, "For the user's tasks, we assume there are only 10 labeled examples available for each class. We adopt the small labeled dataset to help generate specifications and fine-tune the selected trained model, leaving the others as the test data." This describes how a portion of the data is used for fine-tuning and specification generation and the rest for testing, but it does not define a separate validation set for hyperparameter tuning or early stopping in a standard train/validation/test split.
Hardware Specification | Yes | "In our experiments, we estimate the training time for fine-tuning a ResNet-18 on a single NVIDIA 3090 GPU card to be close to 48s."
Software Dependencies | No | The paper mentions specific models (ResNet-18, DenseNet-201) and optimizers (SGD) but gives no version numbers for software libraries, programming languages, or other dependencies (e.g., Python, PyTorch/TensorFlow, CUDA).
Experiment Setup | Yes | "We train the model for 500 epochs using the SGD optimizer. The initial learning rate is 0.1, and we adopt the cosine annealing learning rate decay strategy. The feature extractor G(·) in our experiments is a DenseNet-201 (Huang et al., 2017) pre-trained on the ImageNet dataset. To generate the model specification, we train a linear model for 200 epochs using the SGD optimizer, and the learning rate is 0.1."
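The pseudocode row above names two procedures: Algorithm 1 builds a specification from a model's data, and Algorithm 2 scores candidate specifications against the user's data. The sketch below is a deliberately simplified illustration, not the paper's actual method: it uses class-mean feature prototypes as the "specification" and cosine similarity for matching, with per-class best matching so the label spaces need not coincide. All function names here are hypothetical.

```python
import numpy as np

def assign_specification(X, Y, extract):
    """Algorithm 1 (sketch): summarize a model's training data {X, Y}
    as a specification -- here, one mean feature vector per class."""
    feats = extract(X)  # features G(X), shape (n, d)
    return {c: feats[Y == c].mean(axis=0) for c in np.unique(Y)}

def identify_learnware(X_t, Y_t, specs, extract):
    """Algorithm 2 (sketch): score each candidate specification against
    the user's data and return the index of the best match."""
    user_spec = assign_specification(X_t, Y_t, extract)

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    scores = []
    for spec in specs:
        # For each user class, take its best-matching class in the spec;
        # heterogeneous label spaces need not share class names.
        sims = [max(cos(u, v) for v in spec.values())
                for u in user_spec.values()]
        scores.append(sum(sims) / len(sims))
    return int(np.argmax(scores)), scores
```

A model whose specification covers feature regions resembling the user's classes scores highest and is returned for reuse.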
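The dataset-splits row quotes a protocol of 10 labeled examples per class for the user, with the remainder held out as test data. A minimal sketch of that split, assuming index-based selection (the function name is illustrative):

```python
import numpy as np

def few_shot_split(Y, n_per_class=10, seed=0):
    """Draw n_per_class labeled examples per class for specification
    generation and fine-tuning; everything else becomes test data."""
    rng = np.random.default_rng(seed)
    labeled = []
    for c in np.unique(Y):
        idx = np.flatnonzero(Y == c)
        labeled.extend(rng.choice(idx, size=n_per_class, replace=False))
    labeled = np.sort(np.array(labeled))
    test = np.setdiff1d(np.arange(len(Y)), labeled)
    return labeled, test
```

Note that this yields only labeled/test indices; as the row observes, no separate validation set is carved out.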