Model Spider: Learning to Rank Pre-Trained Models Efficiently

Authors: Yi-Kai Zhang, Ting-Ji Huang, Yao-Xiang Ding, De-Chuan Zhan, Han-Jia Ye

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate Model Spider on three benchmarks: PTM zoos comprising heterogeneous models drawn from single-source or multi-source datasets, or composed of large language models. We analyze the influence of key components in Model Spider and visualize the ability of a PTM using spider charts based on the learned representation."
Researcher Affiliation | Academia | "National Key Laboratory for Novel Software Technology, Nanjing University, China; State Key Lab of CAD & CG, Zhejiang University"
Pseudocode | Yes | "Algorithm 1: The Training Part of Model Spider"
Open Source Code | Yes | "Code is available at https://github.com/zhangyikaii/Model-Spider."
Open Datasets | Yes | "We evaluate various methods on 9 downstream datasets, i.e., Aircraft [59], Caltech101 [32], Cars [47], CIFAR10 [49], CIFAR100 [49], DTD [19], Pet [73], and SUN397 [107] for classification, and UTKFace [118] and dSprites [61] for regression."
Dataset Splits | Yes | "Specifically, we grid-search the learning rates (7 learning rates from 10^-1 to 10^-4, logarithmically spaced) and weight decays (7 weight decays from 10^-6 to 10^-3, logarithmically spaced) to select the best hyper-parameters on the validation set and compute the accuracy on the downstream test set."
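The log-spaced hyper-parameter ranges quoted above can be reproduced with a few lines of code. This is a minimal sketch; the helper name `log_space` is ours, not from the paper:

```python
def log_space(start_exp, stop_exp, num):
    """Return `num` values logarithmically spaced from 10**start_exp to 10**stop_exp."""
    step = (stop_exp - start_exp) / (num - 1)
    return [10 ** (start_exp + i * step) for i in range(num)]

# 7 learning rates from 1e-1 down to 1e-4, 7 weight decays from 1e-6 up to 1e-3,
# matching the ranges described in the paper's validation-set grid search.
learning_rates = log_space(-1, -4, 7)
weight_decays = log_space(-6, -3, 7)
```

Each of the 7 × 7 = 49 (learning rate, weight decay) pairs would then be scored on the validation split, with the best pair evaluated on the downstream test set.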
Hardware Specification | Yes | "We build the model zoo with around 5K GPU hours (on NVIDIA V100 GPUs)."
Software Dependencies | No | The paper mentions software such as PyTorch and EsViT but does not provide version numbers for these or any other libraries or solvers.
Experiment Setup | Yes | "We meticulously conduct a grid search over hyper-parameters such as optimizers, learning rates, and weight decays (2 optimizers, SGD or Adam; 6 learning rates from 5×10^-2 to 10^-4; 3 weight-decay values from 5×10^-4 to 10^-5; batch size of 128; and a maximum of 100 epochs)."
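The full grid implied by this setup is the Cartesian product of the three search dimensions. A hedged sketch, assuming log-spaced values between the stated endpoints (the helper `log_space` is our naming, not the paper's):

```python
import itertools
import math

def log_space(start, stop, num):
    """Return `num` values logarithmically spaced from `start` to `stop` (inclusive)."""
    ls, le = math.log10(start), math.log10(stop)
    return [10 ** (ls + i * (le - ls) / (num - 1)) for i in range(num)]

# 2 optimizers x 6 learning rates (5e-2 .. 1e-4) x 3 weight decays (5e-4 .. 1e-5)
grid = list(itertools.product(
    ["SGD", "Adam"],
    log_space(5e-2, 1e-4, 6),
    log_space(5e-4, 1e-5, 3),
))
# 2 * 6 * 3 = 36 candidate configurations, each trained with batch size 128
# for up to 100 epochs in the setup described above.
```

Each configuration in `grid` is a tuple of (optimizer name, learning rate, weight decay) that would be evaluated in turn during the search.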