Learning to cluster in order to transfer across domains and tasks

Authors: Yen-Chang Hsu, Zhaoyang Lv, Zsolt Kira

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section contains evaluations with four image datasets and covers both cross-task and cross-domain schemes. The details are described below, and the differences between experimental settings are illustrated in appendix A.
Researcher Affiliation | Academia | Yen-Chang Hsu, Zhaoyang Lv (Georgia Institute of Technology, Atlanta, GA 30332, USA; {yenchang.hsu, zhaoyang.lv}@gatech.edu); Zsolt Kira (Georgia Tech Research Institute, Atlanta, GA 30318, USA; zkira@gatech.edu)
Pseudocode | No | The paper includes diagrams of network architectures (Figures 1-7) but does not provide any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the authors' code for the described methodology is publicly available.
Open Datasets | Yes | The Omniglot dataset (Lake et al., 2015) contains 1623 different handwritten characters, and each of them has 20 images drawn by different people. The 1000-class dataset is separated into 882-class (ImageNet882) and 118-class (ImageNet118) subsets as the random split in Vinyals et al. (2016). Office-31 (Saenko et al., 2010) has images from 31 categories of office objects. The 4652 images are obtained from three domains: Amazon (a), DSLR (d), and Webcam (w). We also evaluated the CCN+ on another widely compared scenario, which uses color Street View House Numbers images (SVHN) (Netzer et al., 2011) as S and the gray-scale hand-written digits (MNIST) (LeCun, 1998) as T.
Dataset Splits | Yes | We use the Omniglot-bg set as the auxiliary dataset (A) and the Omniglot-eval set as the target data (T). The 1000-class dataset is separated into 882-class (ImageNet882) and 118-class (ImageNet118) subsets as the random split in Vinyals et al. (2016). We follow the standard protocols using deep neural networks (Long et al., 2017; Ganin et al., 2016) for unsupervised domain adaptation.
Hardware Specification | No | The paper does not explicitly mention any specific hardware (e.g., GPU models, CPU types, or cloud computing instances) used for running the experiments.
Software Dependencies | No | The CCN+/++ and DANN (RevGrad) with ResNet backbone are implemented with Torch. We use the code from the original authors for JAN. (No version numbers are specified for Torch or other libraries.)
Experiment Setup | Yes | Each mini-batch has a size of 100 and is sampled from a random 20 characters to make sure the amount of similar pairs is reasonable. The only hyper-parameter in LCO is σ, and we set it to 2 for all our experiments. The network is randomly initialized, trained end-to-end, and optimized by stochastic gradient descent with 100 randomly sampled images per mini-batch. For domain adaptation, each mini-batch is constructed from 32 labeled samples from the source and 96 unlabeled samples from the target. The loss function used in our approach is equation (7) and is optimized by stochastic gradient descent.
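To make the quoted mini-batch construction concrete, the following is a minimal sketch, not the authors' released code: the function name, the images_by_class mapping, and the pair-expansion step are assumptions based only on the description of 100 images drawn from 20 randomly chosen character classes and labeled as similar/dissimilar pairs.

```python
import itertools
import random

def sample_pairwise_minibatch(images_by_class, n_classes=20, batch_size=100):
    """Draw `batch_size` images from `n_classes` randomly chosen character
    classes and build binary pairwise-similarity targets for them.
    (Hypothetical sketch; `images_by_class` maps class id -> list of images.)"""
    chosen = random.sample(list(images_by_class.keys()), n_classes)
    pool = [(img, c) for c in chosen for img in images_by_class[c]]
    batch = random.sample(pool, batch_size)

    images = [img for img, _ in batch]
    labels = [c for _, c in batch]

    # A pair is "similar" (target 1) when both images share a character class.
    pairs, targets = [], []
    for i, j in itertools.combinations(range(batch_size), 2):
        pairs.append((i, j))
        targets.append(1 if labels[i] == labels[j] else 0)
    return images, pairs, targets

# Example usage with Omniglot-style data (20 drawings per character class):
# data = {class_id: list_of_images for class_id in range(882)}
# imgs, pairs, tgts = sample_pairwise_minibatch(data)
```

The domain-adaptation batches described above (32 labeled source plus 96 unlabeled target samples) would presumably be assembled analogously from two separate loaders, but the paper does not give code for this either.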