Understanding the Transferability of Representations via Task-Relatedness
Authors: Akshay Mehra, Yunbei Zhang, Jihun Hamm
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments using state-of-the-art pre-trained models show the effectiveness of task-relatedness in explaining transferability on various vision and language tasks. The efficient computability of task-relatedness even without labels of the target task and its high correlation with the model's accuracy after end-to-end fine-tuning on the target task makes it a useful metric for transferability estimation. Our empirical results of using task-relatedness on the problem of selecting the best pre-trained model from a model zoo for a target task highlight its utility for practical problems. |
| Researcher Affiliation | Academia | Akshay Mehra, Yunbei Zhang, and Jihun Hamm Tulane University {amehra, yzhang111, jhamm3}@tulane.edu |
| Pseudocode | Yes | Alg. 1 shows how we solve Eq. 4 (see App. D for additional details of the algorithm). Algorithm 1 Minimization of the bound in Theorem 3 Input: Reference task samples and labels (ZR, YR), Target task samples (ZT ), Target task labels (YT ) (optional). Output: Estimate of task-relatedness using the learned transformations A, A, B, D. |
| Open Source Code | Yes | Our codes can be found at https://github.com/akshaymehra24/TaskTransferAnalysis. |
| Open Datasets | Yes | Aircraft [35]: consists of 10,000 aircraft images belonging to 100 classes. CIFAR-10/100 [30]: These datasets contain 60,000 images belonging to 10/100 categories. DTD [13]: consists of 5,640 textural images belonging to 47 categories. Fashion MNIST [60]: consists of 70,000 grayscale images belonging to 10 categories. Pets [43]: consists of 7,049 images of Cats and Dogs spread across 37 categories. ImageNet [17]: consists of 1.1 million images belonging to 1000 categories. Yelp [63]: consists of 650,000 training and 50,000 test examples belonging to 5 classes. Stanford Sentiment Treebank (SST-5) [63]: consists of 8,544 training and 2,210 test samples belonging to 5 classes. AG News [63]: consists of 120,000 training and 7,600 test examples belonging to 4 classes. DBPedia [63]: consists of 560,000 training and 70,000 test examples belonging to 14 classes. |
| Dataset Splits | Yes | Yelp [63]: consists of 650,000 training and 50,000 test examples belonging to 5 classes. For evaluation, we use a similar subsample of the validation dataset of ImageNet containing all the samples belonging to the subsampled classes. |
| Hardware Specification | Yes | All codes are written in Python using Tensorflow/Pytorch and were run on an Intel(R) Xeon(R) Platinum 8358 CPU with 200 GB of RAM and an Nvidia A10 GPU. |
| Software Dependencies | No | The paper mentions using "Python", "Tensorflow", and "Pytorch" but does not specify their version numbers or the versions of any other key libraries or dependencies required for reproduction. |
| Experiment Setup | Yes | Implementation and hyperparameters are described below. All codes are written in Python using Tensorflow/Pytorch... We use a batch size of 1000 for ResNet18 models (representation dimension 512) and 2500 for ResNet50 models (representation dimension 2048). Fine-tuning runs for a total of 5000 epochs. Linear classifiers are trained with a gradient norm penalty (with τ = 0.02). |
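The experiment-setup row reports that linear classifiers are trained on frozen representations with a gradient norm penalty (τ = 0.02). The paper's exact penalty formulation is not quoted here, so the sketch below is an assumption: it penalizes the L2 norm of the loss gradient with respect to the input representations, a common form of this regularizer. The function name `train_linear_probe` and all hyperparameters other than τ are hypothetical.

```python
import torch
import torch.nn as nn

def train_linear_probe(features, labels, num_classes, tau=0.02,
                       epochs=10, lr=1e-3):
    """Train a linear classifier on frozen features with a gradient
    norm penalty on the input representations (assumed penalty form;
    tau = 0.02 follows the reported setup)."""
    clf = nn.Linear(features.shape[1], num_classes)
    opt = torch.optim.Adam(clf.parameters(), lr=lr)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        # Track gradients w.r.t. the (otherwise frozen) representations.
        x = features.clone().requires_grad_(True)
        loss = ce(clf(x), labels)
        # Gradient of the loss w.r.t. the representations (double backprop).
        (g,) = torch.autograd.grad(loss, x, create_graph=True)
        penalty = g.norm(p=2, dim=1).mean()
        opt.zero_grad()
        (loss + tau * penalty).backward()
        opt.step()
    return clf
```

With the batch sizes from the table, `features` would be a 1000×512 tensor for ResNet18 representations or a 2500×2048 tensor for ResNet50.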