Scalable Diverse Model Selection for Accessible Transfer Learning

Authors: Daniel Bolya, Rohit Mittapalli, Judy Hoffman

NeurIPS 2021

Reproducibility Assessment

Each variable below lists the assessed result, followed by the LLM response that supports it.
Research Type: Experimental
LLM Response: "We formalize this setting as Scalable Diverse Model Selection and propose several benchmarks for evaluating on this task. We find that existing model selection and transferability estimation methods perform poorly here and analyze why this is the case. We then introduce simple techniques to improve the performance and speed of these algorithms. Finally, we iterate on existing methods to create PARC, which outperforms all other methods on diverse model selection."
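
For context, PARC (Pairwise Annotation Representation Comparison) scores a candidate pretrained model by extracting its features on a small probe set of target images and rank-correlating the pairwise feature distances with the pairwise label distances. Below is a minimal sketch of such a score; it is illustrative only, not the released implementation (see the code link further down), and the `parc_score` interface is an assumption:

```python
import numpy as np
from scipy.stats import spearmanr

def parc_score(features, labels_onehot):
    """PARC-style transferability score (sketch).

    features: (n, d) probe-set features from a candidate model.
    labels_onehot: (n, c) one-hot target labels for the same images.
    Higher scores should indicate better expected transfer.
    """
    # Pairwise (1 - Pearson correlation) distances between rows.
    dist_f = 1.0 - np.corrcoef(features)
    dist_y = 1.0 - np.corrcoef(labels_onehot)
    # Compare the two distance structures on the off-diagonal
    # lower triangle using Spearman (rank) correlation.
    idx = np.tril_indices_from(dist_f, k=-1)
    rho, _ = spearmanr(dist_f[idx], dist_y[idx])
    return rho
```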
Researcher Affiliation: Academia
LLM Response: Daniel Bolya (Georgia Tech, dbolya@gatech.edu); Rohit Mittapalli (Georgia Tech, rmittapalli3@gatech.edu); Judy Hoffman (Georgia Tech, judy@gatech.edu)
Pseudocode: No
LLM Response: The paper describes its methods in prose and mathematical formulas, but it does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured, code-like steps.
Open Source Code: Yes
LLM Response: "We have released all benchmarks and evaluation code at https://dbolya.github.io/parc/ in hopes to further the development of this promising area of research."
Open Datasets: Yes
LLM Response: "Thus, for this benchmark, we choose 6 well-known classification datasets of various difficulties that contain related subthemes: Pets: Stanford Dogs [26] and Oxford Pets [35], Birds: CUB200 [55] and NABirds [54], and Miscellaneous: CIFAR10 [29] and Caltech101 [14]. We also include VOC2007 [13] and ImageNet-1k [7] as the 7th and 8th source datasets, but not as targets."
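
The benchmark structure described in this excerpt can be summarized as a small configuration. The variable names below are illustrative, not taken from the released code:

```python
# Datasets as described in the excerpt above, grouped by subtheme.
THEMES = {
    "Pets": ["Stanford Dogs", "Oxford Pets"],
    "Birds": ["CUB200", "NABirds"],
    "Miscellaneous": ["CIFAR10", "Caltech101"],
}
# VOC2007 and ImageNet-1k serve as the 7th and 8th source datasets,
# but are not used as targets.
SOURCE_ONLY = ["VOC2007", "ImageNet-1k"]

SOURCES = [d for group in THEMES.values() for d in group] + SOURCE_ONLY
TARGETS = [d for group in THEMES.values() for d in group]
```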
Dataset Splits: No
LLM Response: The paper mentions using a "probe set" of n = 500 images for model selection (a form of validation for the selection method), but it does not specify explicit training/validation splits for the final fine-tuning of models on the larger target training data. While it states that the authors "employ grid search to find optimal hyperparameters", it does not say how the data was split for this search (e.g., percentages or counts for a validation set).
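
To make the probe-set protocol concrete, here is a minimal sketch of drawing such a subsample. The per-class balancing and the `sample_probe_set` interface are assumptions; the excerpt states only that n = 500:

```python
import random
from collections import defaultdict

def sample_probe_set(dataset, n=500, seed=0):
    """Draw a small probe set from the target training data (sketch).

    `dataset` is assumed to be a sequence of (image, label) pairs.
    Sampling is done per class so the probe set stays roughly balanced
    -- an illustrative choice, not specified in the paper excerpt.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, (_, label) in enumerate(dataset):
        by_class[label].append(idx)
    per_class = max(1, n // len(by_class))
    picked = []
    for indices in by_class.values():
        picked.extend(rng.sample(indices, min(per_class, len(indices))))
    rng.shuffle(picked)
    return picked[:n]
```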
Hardware Specification: Yes
LLM Response: "All models are trained on Titan Xp GPUs and all transferability methods are evaluated on the CPU."
Software Dependencies: No
LLM Response: The paper does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or other library versions) that would be needed for reproducibility.
Experiment Setup: No
LLM Response: The paper mentions general settings, such as images being resized to 224×224 and training with SGD plus grid search for optimal hyperparameters, but it does not provide the specific hyperparameter values (e.g., learning rate, batch size, number of epochs) used in the experiments.
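
For illustration, here is a sketch of the SGD-plus-grid-search setup the response refers to. Every hyperparameter value below (learning rates, weight decays, momentum, epoch count) is a placeholder, since the paper does not report them:

```python
import itertools
import torch

# Hypothetical grid -- the paper states only that SGD and grid search
# were used; these values are illustrative placeholders.
GRID = {
    "lr": [1e-2, 1e-3],
    "weight_decay": [0.0, 1e-4],
}

def fine_tune(model, train_loader, val_loader, lr, weight_decay, epochs=30):
    """Fine-tune with SGD and return validation accuracy (sketch)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr,
                          momentum=0.9, weight_decay=weight_decay)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        model.train()
        for images, labels in train_loader:  # images resized to 224x224
            opt.zero_grad()
            loss_fn(model(images), labels).backward()
            opt.step()
    # Evaluate on the held-out split.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            correct += (model(images).argmax(1) == labels).sum().item()
            total += labels.numel()
    return correct / total

def grid_search(model_fn, train_loader, val_loader):
    """Return the best (accuracy, hyperparameters) over the grid."""
    best = (-1.0, None)
    for lr, wd in itertools.product(GRID["lr"], GRID["weight_decay"]):
        acc = fine_tune(model_fn(), train_loader, val_loader, lr, wd)
        best = max(best, (acc, (lr, wd)))
    return best
```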