Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning

Authors: Utku Evci, Vincent Dumoulin, Hugo Larochelle, Michael C Mozer

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In evaluations on the Visual Task Adaptation Benchmark (VTAB), Head2Toe matches performance obtained with fine-tuning on average while reducing training and storage cost a hundred fold or more, but critically, for out-of-distribution transfer, Head2Toe outperforms fine-tuning.
Researcher Affiliation | Industry | Google Research, Brain Team. Correspondence to: Utku Evci <evcu@google.com>.
Pseudocode | No | The paper describes its method in prose and mathematical equations (e.g., Equations 1 and 2), but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code | Yes | We open source our code at https://github.com/google-research/head2toe
Open Datasets | Yes | In our experiments, we use source models pretrained on ImageNet 2012 (Russakovsky et al., 2015)... Visual Task Adaptation Benchmark-1k (Zhai et al., 2019) to evaluate different methods. VTAB-1k consists of 19 different classification tasks, each having between 2 to 397 classes and a total of 1000 training examples.
Dataset Splits | Yes | We perform five-fold cross validation for each task and method in order to pick the best hyperparameters. We pick hyperparameters for each VTAB task separately by doing a 5-fold cross validation on the training data.
Hardware Specification | No | The paper does not explicitly state the specific hardware used for running its experiments, such as GPU models, CPU models, or cloud computing instance types. It refers to general training processes but lacks hardware specifications.
Software Dependencies | No | The paper refers to common model architectures like 'ResNet-50' and 'ViT-B/16' but does not specify the software dependencies with version numbers (e.g., TensorFlow 2.x, PyTorch 1.x, Python 3.x) required to replicate the experiment.
Experiment Setup | Yes | All methods search over the same learning rates and training steps (two values of each). More details on hyperparameter selection and values used are shared in Appendix A. For HEAD2TOE we choose ℓ2,1 regularization coefficients from (0.001, 0.00001) and target feature sizes from (1024, 16384, 40000) for ResNet-50 and (768, 15360, 32448) for ViT-B/16.
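Since the paper gives no pseudocode, the core Head2Toe idea can be illustrated with a minimal sketch: concatenate features from all intermediate layers, score feature groups by the per-feature norm of a linear probe (the group norm appearing in the paper's ℓ2,1 regularizer), keep the top `target_size` features, and refit a linear head on them. This is a loose illustration, not the paper's implementation: the ridge-regression probe stands in for the ℓ2,1-regularized training, and all function and variable names are made up here.

```python
import numpy as np

rng = np.random.default_rng(0)

def concat_intermediate_features(per_layer_feats):
    # Head2Toe uses (pooled) features from all layers of the backbone,
    # not just the final embedding.
    return np.concatenate(per_layer_feats, axis=1)

def head2toe_sketch(per_layer_feats, labels, num_classes, target_size,
                    l2_coef=1e-3):
    X = concat_intermediate_features(per_layer_feats)
    Y = np.eye(num_classes)[labels]  # one-hot targets
    d = X.shape[1]
    # Ridge-regression linear head: a cheap stand-in (assumption) for the
    # ell_{2,1}-regularized training described in the paper.
    W = np.linalg.solve(X.T @ X + l2_coef * np.eye(d), X.T @ Y)
    # Score each feature by the ell_2 norm of its weight row, i.e. the
    # group norm inside the ell_{2,1} penalty, and keep the top-k.
    scores = np.linalg.norm(W, axis=1)
    keep = np.argsort(scores)[-target_size:]
    # Refit a linear head on the selected features only.
    Xs = X[:, keep]
    Ws = np.linalg.solve(Xs.T @ Xs + l2_coef * np.eye(target_size), Xs.T @ Y)
    return keep, Ws

# Toy usage: two "layers" of features for 32 examples, 3 classes.
layers = [rng.normal(size=(32, 8)), rng.normal(size=(32, 16))]
labels = rng.integers(0, 3, size=32)
keep, W = head2toe_sketch(layers, labels, num_classes=3, target_size=10)
print(keep.shape, W.shape)  # -> (10,) (10, 3)
```

At inference, only the selected feature indices and the small linear head need to be stored per task, which is what drives the storage savings the paper reports.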
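The per-task 5-fold cross-validation over a small hyperparameter grid can be sketched as follows. The ℓ2,1 coefficients and ResNet-50 target feature sizes come from the text above; the learning-rate and training-step values are placeholders (the paper says two values of each were searched, with exact values in Appendix A), and `train_and_eval` is a hypothetical user-supplied function.

```python
import itertools
import random

# Grid values from the paper for ResNet-50; learning_rate and
# train_steps are placeholders (assumption), not the paper's values.
grid = {
    "l21_coef": [0.001, 0.00001],
    "target_size": [1024, 16384, 40000],
    "learning_rate": [0.01, 0.1],   # assumption
    "train_steps": [500, 5000],     # assumption
}

def five_fold_indices(n, k=5, seed=0):
    # Shuffle example indices once, then deal them into k folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(train_and_eval, n_examples, grid):
    """Pick the config with the best mean held-out score over 5 folds.

    `train_and_eval(config, train_idx, val_idx)` is a user-supplied
    function returning a validation score (higher is better).
    """
    folds = five_fold_indices(n_examples)
    best_score, best_cfg = float("-inf"), None
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        scores = []
        for i, val_idx in enumerate(folds):
            train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
            scores.append(train_and_eval(cfg, train_idx, val_idx))
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best_score, best_cfg = mean, cfg
    return best_cfg, best_score

# Toy scorer that simply prefers the smaller regularizer, to exercise
# the loop end to end on a VTAB-sized training set of 1000 examples.
cfg, score = cross_validate(
    lambda c, tr, va: -c["l21_coef"], n_examples=1000, grid=grid)
print(cfg["l21_coef"])  # -> 1e-05
```

In the paper's setup this selection is repeated independently for each of the 19 VTAB-1k tasks, so each task can end up with a different winning configuration.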