Cross-Domain Kernel Induction for Transfer Learning

Authors: Wei-Cheng Chang, Yuexin Wu, Hanxiao Liu, Yiming Yang

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on benchmark datasets show advantageous performance of the proposed method over that of other state-of-the-art TL methods.
Researcher Affiliation | Academia | Wei-Cheng Chang, Carnegie Mellon University, wchang2@andrew.cmu.edu; Yuexin Wu, Carnegie Mellon University, yuexinw@andrew.cmu.edu; Hanxiao Liu, Carnegie Mellon University, hanxiaol@cs.cmu.edu; Yiming Yang, Carnegie Mellon University, yiming@cs.cmu.edu
Pseudocode | No | The paper presents mathematical formulations and descriptions of algorithms but does not include structured pseudocode or an algorithm block.
Open Source Code | No | The paper provides links to the source code of baseline methods but does not provide concrete access to the source code of its own proposed KerTL method.
Open Datasets | Yes | Amazon Product Reviews (APR): The APR dataset (Prettenhofer and Stein 2010) was designed for evaluations of sentiment classification with transfer learning in cross-language and cross-domain settings. It consists of Amazon product reviews on books (B), DVDs (D) and music (M), written in English (EN), German (GE), French (FR) and Japanese (JP). For each language and each product type (B, D or M), there are 2000 labeled reviews for training and 2000 labeled reviews for testing. Parallel data are also provided for each language pair, which we will describe with an example task. MNIST Handwritten Images: The MNIST dataset consists of 70,000 images in total, with digits from 0 to 9 as the class labels (one per image). We follow the setting in (Chandar et al. 2015) to treat the left half of each image (28 × 28 pixels) as a source-domain instance and the right half as a target-domain instance. Raw pixel values are used as features. We randomly sampled 3,000 images from the full set as the unlabeled parallel set, 2,000 images as the source-domain training set, 1,024 images as the target-domain training set, and another 2,000 images as the test set (only the target-domain portion is used). (A sketch of this MNIST construction appears after the table.)
Dataset Splits | Yes | For each task in APR, we use the full set of 2000 source-domain labeled instances, and a randomly sampled subset of m target-domain labeled instances (m = 2, 4, 8, 16, 32) from the full set as the final training data. The remaining target-domain labeled instances (2000 − m) are used for validation (hyper-parameter tuning). (A sketch of this split appears after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using 'L2-SVM from LIBLINEAR (Fan et al. 2008)' and links to its website, but it does not specify exact version numbers for LIBLINEAR or any other software components (e.g., programming languages, libraries, or operating systems) used for the experiments.
Experiment Setup | Yes | For all the methods using SVM classifiers (SVM, HFA, MMDT, HHTL and CorrNet), we set the regularization parameter C = 1. For hyper-parameter tuning, we set the default hyper-parameters of HFA and MMDT the same as in their papers. We adopted the hyper-parameters of HHTL on the APR data, with a grid search over the regularization coefficient λ = 0.001, 0.01, 1, 10, and 100 and over the corruption probability p = 0.5, 0.6, 0.7, 0.8, and 0.9 on the MNIST dataset. Similarly, for CorrNet we grid-searched the number of hidden units over 20, 50, 100, and 200 on the MNIST dataset, and λ = 0.2, 2, and 20 on the APR dataset. For KerTL, we used the cosine similarity and the RBF kernel on the APR and MNIST datasets, respectively. We keep the top 128 eigenvectors in the eigen-decomposition step for efficient computation and set the regularization coefficient γ to 2 × 10^-4. (A configuration sketch appears after the table.)
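
The MNIST source/target construction quoted in the Open Datasets row can be reproduced in a few lines of NumPy. The sketch below is an assumption-based illustration rather than the authors' code (which is not available): the use of scikit-learn's fetch_openml loader and the random seed are assumptions.

```python
import numpy as np
from sklearn.datasets import fetch_openml

# MNIST: 70,000 images of 28 x 28 pixels, digits 0-9 as class labels.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X.reshape(-1, 28, 28)

# Setting of (Chandar et al. 2015): the left half of each image is a
# source-domain instance, the right half a target-domain instance.
X_src = X[:, :, :14].reshape(len(X), -1)   # 28 x 14 = 392 raw-pixel features
X_tgt = X[:, :, 14:].reshape(len(X), -1)

# Non-overlapping random sample: 3,000 unlabeled parallel pairs,
# 2,000 source-domain training, 1,024 target-domain training, 2,000 test.
rng = np.random.default_rng(0)             # the seed is an assumption
idx = rng.permutation(len(X))
parallel = (X_src[idx[:3000]], X_tgt[idx[:3000]])        # labels unused
src_train = (X_src[idx[3000:5000]], y[idx[3000:5000]])
tgt_train = (X_tgt[idx[5000:6024]], y[idx[5000:6024]])
test = (X_tgt[idx[6024:8024]], y[idx[6024:8024]])        # target half only
```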
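For the APR tasks, the Dataset Splits row describes an m-shot target-domain split with the remaining labeled target instances held out for tuning. A minimal sketch of that split, assuming NumPy arrays for one task's 2,000 labeled target-domain reviews (the placeholder data and seeding scheme below are assumptions):

```python
import numpy as np

def split_target_domain(X_tgt, y_tgt, m, seed=0):
    """Sample m labeled target-domain instances for training and hold out
    the remaining (2000 - m) for validation (hyper-parameter tuning)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_tgt))
    tr, va = idx[:m], idx[m:]
    return (X_tgt[tr], y_tgt[tr]), (X_tgt[va], y_tgt[va])

# Placeholder stand-ins for one APR task's 2,000 labeled target reviews.
X_tgt = np.random.randn(2000, 500)
y_tgt = np.random.choice([-1, 1], size=2000)

# The paper varies m over {2, 4, 8, 16, 32}; the source-domain training set
# is always the full 2,000 labeled source-domain instances.
for m in (2, 4, 8, 16, 32):
    (Xt_tr, yt_tr), (Xt_val, yt_val) = split_target_domain(X_tgt, y_tgt, m)
```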
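The Experiment Setup row can likewise be summarized as a configuration sketch. Here scikit-learn's LinearSVC (which wraps LIBLINEAR) stands in for the paper's L2-SVM, and SciPy's eigsh illustrates keeping only the top 128 eigenvectors; the library choices, solver flags, and function names are assumptions, not the authors' implementation.

```python
from scipy.sparse.linalg import eigsh
from sklearn.svm import LinearSVC

# L2-regularized SVM with C = 1, used by SVM, HFA, MMDT, HHTL and CorrNet.
clf = LinearSVC(C=1.0, loss="squared_hinge")   # exact solver settings assumed

# Hyper-parameter grids quoted from the paper.
hhtl_grid = {"lambda": [0.001, 0.01, 1, 10, 100],        # APR
             "corruption_p": [0.5, 0.6, 0.7, 0.8, 0.9]}  # MNIST
corrnet_grid = {"hidden_units": [20, 50, 100, 200],      # MNIST
                "lambda": [0.2, 2, 20]}                   # APR

# KerTL: cosine similarity (APR) or RBF kernel (MNIST), top-128 eigenvectors,
# regularization gamma = 2e-4. Only the truncation step is sketched here.
def top_k_eigenpairs(K, k=128):
    """Return the k largest eigenpairs of a symmetric kernel matrix K."""
    vals, vecs = eigsh(K, k=k, which="LA")
    return vals, vecs

gamma = 2e-4
```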