Asymmetric Transfer Learning with Deep Gaussian Processes

Authors: Melih Kandemir

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on two applications. The first is a real-world image categorization benchmark, where the domains are different image data sets collected by cameras of different resolutions and for different purposes. The second is tumor detection in histopathology tissue slide images (Gurcan et al., 2009). Our model reaches state-of-the-art prediction performance in the first application, and improves it in the second.
Researcher Affiliation | Academia | Melih Kandemir (MELIH.KANDEMIR@IWR.UNI-HEIDELBERG.DE), Heidelberg University, HCI/IWR
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code of our model is publicly available at https://github.com/melihkandemir/atldgp
Open Datasets | Yes | We use the benchmark data set constructed by Saenko et al. (2010) for domain adaptation experiments, which consists of images of 10 categories... caltech: images chosen from the Caltech 256 data set (Griffin et al., 2007)... The breast data set consists of 58 images... (http://www.bioimage.ucsb.edu/research/biosegmentation)
Dataset Splits | Yes | We use the 800-dimensional SURF-BoW features provided by Gong et al. (2012), and the 20 train/test splits provided by Hoffman et al. (2013a). Each train split consists of 20 images from the source domain for amazon and eight images for the other three domains, and three images from the target domain. All the remaining points in the target domain are left for the test split. We generate a training split by randomly choosing 200 instances from the source domain, and five instances from the target domain per class. (A split-generation sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | For all sparse GP components, we use 10 inducing points that are initialized to cluster centroids found by k-means... We set the inducing points of the first-layer GPs to instances chosen from the training set at random, and learn them from data for the second-layer GPs... We initialize e_i^r and m_i^c to their least-squares fit to the predictive mean of the GP they belong to... For all sparse GP models, we start the learning rate from 0.001, take a gradient step if it increases the lower bound, or multiply the learning rate by 0.9 otherwise. For all models that learn latent data representations... we set the latent dimensionality to 20. For all kernel learners, we used an RBF kernel with isotropic covariance. (A training-loop sketch follows the table.)
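
The split protocol quoted in the Dataset Splits row is easy to misread, so a short sketch may help. This is a minimal illustration under stated assumptions, not the authors' code: the function name make_split and the label arrays y_src and y_tgt are hypothetical, and it adopts the reading that 200 source instances are drawn in total while five target instances are drawn per class.

```python
import numpy as np

def make_split(y_src, y_tgt, n_src=200, n_tgt_per_class=5, seed=0):
    """Hypothetical split generator mirroring the quoted protocol:
    200 random source instances and five target instances per class
    for training; all remaining target instances form the test set."""
    rng = np.random.default_rng(seed)
    # Source training set: 200 instances drawn without replacement.
    src_train = rng.choice(len(y_src), size=n_src, replace=False)
    # Target training set: five instances per class.
    tgt_train = []
    for c in np.unique(y_tgt):
        idx = np.flatnonzero(y_tgt == c)
        tgt_train.extend(rng.choice(idx, size=n_tgt_per_class, replace=False))
    tgt_train = np.array(tgt_train)
    # Everything else in the target domain is held out for testing.
    tgt_test = np.setdiff1d(np.arange(len(y_tgt)), tgt_train)
    return src_train, tgt_train, tgt_test
```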
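
The Experiment Setup row likewise describes two mechanical pieces that can be restated in code: initializing the inducing points of each sparse GP to k-means centroids, and a gradient-ascent loop that accepts a step only if it increases the variational lower bound, multiplying the learning rate by 0.9 otherwise. The sketch below is an assumption, not the paper's implementation: bound_fn and grad_fn stand in for the model's lower bound and its gradient, which the paper does not expose at this level, and ascend_lower_bound is a hypothetical name.

```python
import numpy as np
from scipy.cluster.vq import kmeans

def init_inducing_points(X, num_inducing=10):
    """Initialize sparse-GP inducing points to k-means centroids,
    matching the quoted setup (10 inducing points per GP)."""
    centroids, _ = kmeans(X.astype(float), num_inducing)
    return centroids

def ascend_lower_bound(params, bound_fn, grad_fn, lr=1e-3, decay=0.9,
                       max_iter=1000, min_lr=1e-8):
    """Hypothetical training loop with the quoted schedule: start the
    learning rate at 0.001, keep a gradient step only if it increases
    the lower bound, and otherwise shrink the rate by a factor of 0.9."""
    best = bound_fn(params)
    for _ in range(max_iter):
        candidate = params + lr * grad_fn(params)  # gradient ascent step
        value = bound_fn(candidate)
        if value > best:
            params, best = candidate, value  # step improved the bound
        else:
            lr *= decay                      # reject step, shrink the rate
            if lr < min_lr:
                break
    return params, best
```

In the paper's setting this loop would run over the variational parameters of each sparse GP component; it is shown here for a generic parameter vector.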