A Mathematical Framework for Quantifying Transferability in Multi-source Transfer Learning

Authors: Xinyi Tong, Xiangxiang Xu, Shao-Lun Huang, Lizhong Zheng

NeurIPS 2021

Reproducibility assessment — each variable is listed with its result, followed by the supporting excerpt (LLM response):
Research Type: Experimental
"Finally, experiments on image classification tasks show that our approach outperforms existing transfer learning algorithms in multi-source and few-shot scenarios." (Section 4, Experiments: "To validate the effectiveness of our algorithms in multi-source learning and few-shot transfer learning scenarios, we conduct a series of experiments on common datasets for image recognition, including CIFAR-10 [14], Office-31 and Office-Caltech [15].")
Researcher Affiliation: Academia
Xinyi Tong, Tsinghua-Berkeley Shenzhen Institute, Tsinghua University (txy18@mails.tsinghua.edu.cn); Xiangxiang Xu, Massachusetts Institute of Technology (xuxx@mit.edu); Shao-Lun Huang, Tsinghua-Berkeley Shenzhen Institute, Tsinghua University (shaolun.huang@sz.tsinghua.edu.cn); Lizhong Zheng, Massachusetts Institute of Technology (lizhong@mit.edu)
Pseudocode: Yes
Algorithm 1: Multi-Source Knowledge Transfer Algorithm
1: Input: target and source data samples {(x_l^(i), y_l^(i))}_{l=1}^{n_i}, i = 0, ..., k
2: Randomly initialize α
3: repeat
4:   (f*, g*) ← arg min_{f,g} L(α*, f, g)
5:   α* ← arg min_{α ∈ A_k} L(α)
6: until α converges
7: (f*, g*) ← arg min_{f,g} L(α*, f, g)
8: return f*, g*
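To make the control flow concrete, here is a minimal, self-contained Python sketch of the alternation in Algorithm 1. It is not the paper's method: the loss is replaced by a toy least-squares surrogate, `centers` and both inner solvers are hypothetical stand-ins, and the simplex form of A_k is an assumption. Only the alternating structure, the use of CVXPY for the non-negative QP in line 5 (as the paper reports), and the 0.05 convergence tolerance from Section 4 are taken from the source.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
k, d = 3, 10                        # k source tasks, d = 10 feature dimension

# Toy per-task statistics: centers[0] stands in for the target task,
# centers[1:] for the k sources.  Purely illustrative.
centers = rng.normal(size=(k + 1, d))

def fit_feature_maps(alpha):
    """Lines 4/7: (f, g) <- argmin_{f,g} L(alpha, f, g).  With the toy
    least-squares surrogate used here, the minimizer is the midpoint of the
    target statistic and the alpha-weighted source statistic."""
    return 0.5 * (centers[0] + centers[1:].T @ alpha)

def fit_weights(f):
    """Line 5: alpha <- argmin_{alpha in A_k} L(alpha), posed as a
    non-negative quadratic program solved with CVXPY.  The simplex
    constraint is an assumption about the feasible set A_k."""
    alpha = cp.Variable(k, nonneg=True)
    objective = cp.Minimize(cp.sum_squares(centers[1:].T @ alpha - f))
    cp.Problem(objective, [cp.sum(alpha) == 1]).solve()
    return alpha.value

alpha = np.full(k, 1.0 / k)         # line 2: initialize alpha
while True:                         # lines 3-6: alternate until convergence
    f = fit_feature_maps(alpha)
    alpha_new = fit_weights(f)
    done = np.max(np.abs(alpha_new - alpha)) <= 0.05   # tolerance from Sec. 4
    alpha = alpha_new
    if done:
        break
f_star = fit_feature_maps(alpha)    # line 7: final refit with converged alpha
print("alpha* =", np.round(alpha, 3))
```

Because both inner steps minimize the same jointly convex surrogate, the alternation is a block coordinate descent and the element-wise change in α shrinks until the stopping rule fires.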
Open Source Code: Yes
"Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See supplemental material."
Open Datasets: Yes
"We conduct multi-source transfer learning experiments on CIFAR-10 [14], which contains 50,000 training images and 10,000 testing images in 10 classes. ... Office-31 and Office-Caltech [15]."
Dataset Splits: No
The paper mentions training and testing data but does not explicitly specify a validation split or how one would be constructed.
Hardware Specification: No
"Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] We use few computation resources in our work."
Software Dependencies: No
"In our implementation of Algorithm 1, we use the CVXPY [17, 18] package for solving the non-negative quadratic programming in line 5."
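The excerpt names CVXPY as the only stated dependency, used for the line-5 subproblem. For reference, a generic non-negative quadratic program in CVXPY looks like the following; the actual Q and c would be derived from the paper's objective L(α) and are random placeholders here.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
k = 3                               # number of source weights

# Placeholder problem data: in the paper, Q and c would come from L(alpha).
A = rng.normal(size=(k, k))
Q = A @ A.T + np.eye(k)             # symmetric positive definite
c = rng.normal(size=k)

alpha = cp.Variable(k, nonneg=True) # the non-negativity constraint on alpha
problem = cp.Problem(cp.Minimize(0.5 * cp.quad_form(alpha, Q) + c @ alpha))
problem.solve()
print("status:", problem.status, "alpha =", np.round(alpha.value, 4))
```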
Experiment Setup: Yes
"Moreover, for each source task, 2000 images are used for training, with 1000 images per binary class, and we set the target sample size to n0 = 6, 20, 100, respectively. Throughout this experiment, the feature f is of dimensionality d = 10, generated by GoogLeNet [16], followed by two fully connected layers for further dimension reduction. ... In addition, the alternating iteration is stopped when the element-wise differences for α computed in two successive iterations are at most 0.05."
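A sketch of the feature pipeline described in this excerpt, assuming a pretrained torchvision GoogLeNet as the backbone. The excerpt only fixes the output dimension d = 10; the hidden width of the reduction head and the use of eval mode are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import googlenet, GoogLeNet_Weights

# Pretrained GoogLeNet backbone with its classifier removed, so the forward
# pass emits the 1024-d pooled features rather than class logits.
backbone = googlenet(weights=GoogLeNet_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()
backbone.eval()                     # eval mode: plain tensor output, no aux heads

# Two fully connected layers reduce the features to d = 10 as described;
# the hidden width of 256 is an assumption, not taken from the paper.
reducer = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 10))

with torch.no_grad():
    x = torch.randn(4, 3, 224, 224)  # dummy batch of images
    f = reducer(backbone(x))         # the d = 10 feature f
print(f.shape)                       # torch.Size([4, 10])
```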