Unsupervised Domain Adaptation by Backpropagation

Authors: Yaroslav Ganin, Victor Lempitsky

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Below, we detail the proposed approach to domain adaptation in deep architectures, and present results on traditional deep learning image datasets (such as MNIST (LeCun et al., 1998) and SVHN (Netzer et al., 2011)) as well as on OFFICE benchmarks (Saenko et al., 2010), where the proposed method considerably improves over previous state-of-the-art accuracy.
Researcher Affiliation | Academia | Yaroslav Ganin (GANIN@SKOLTECH.RU) and Victor Lempitsky (LEMPITSKY@SKOLTECH.RU), Skolkovo Institute of Science and Technology (Skoltech), Moscow Region, Russia
Pseudocode | No | The paper describes the mathematical formulation and behavior of the gradient reversal layer but does not include a structured pseudocode block or algorithm (a minimal sketch of such a layer is given after the table).
Open Source Code | Yes | To this end we release the source code for the Gradient Reversal layer along with the usage examples as an extension to Caffe (Jia et al., 2014) (see (Ganin & Lempitsky, 2015)).
Open Datasets | Yes | These include large-scale datasets of small images popular with deep learning methods, and the OFFICE datasets (Saenko et al., 2010), which are a de facto standard for domain adaptation in computer vision, but have much fewer images. MNIST dataset (LeCun et al., 1998), Street-View House Number dataset SVHN (Netzer et al., 2011).
Dataset Splits | No | The paper mentions 'standard training-test splits' for some datasets, states that 'all training images are used for unsupervised adaptation', and, for SYN SIGNS, uses '31,367 random training samples for unsupervised adaptation and the rest for evaluation.' A 'validation error' is shown in one figure, but the main text does not give an explicit validation split (e.g., percentages or exact counts) distinct from the training and test sets.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using the Caffe package but does not specify its version number or any other software dependencies with their specific versions.
Experiment Setup | Yes | The model is trained on 128-sized batches. Images are preprocessed by mean subtraction. Half of each batch is populated by samples from the source domain (with known labels); the rest comes from the target domain (with unknown labels). To suppress the noisy signal from the domain classifier at the early stages of training, instead of fixing the adaptation factor λ it is gradually increased from 0 to 1 using the schedule λ_p = 2 / (1 + exp(−γ · p)) − 1, where γ was set to 10 in all experiments (the schedule was not optimized/tweaked). A sketch of this schedule and of the batch composition follows the table.
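
As noted in the Pseudocode and Open Source Code rows, the authors release the gradient reversal layer as a Caffe extension rather than as pseudocode. The following is a minimal sketch of the same mechanism, written in PyTorch as an assumption (the class and helper names here are illustrative, not from the paper). The layer acts as the identity in the forward pass and multiplies the incoming gradient by −λ in the backward pass:

```python
import torch


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by -lambda on the way back."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)  # identity mapping

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back into the feature extractor.
        # The second return value is the gradient w.r.t. lambd, which is not needed.
        return -ctx.lambd * grad_output, None


def grad_reverse(x, lambd=1.0):
    """Illustrative helper: insert between the feature extractor and the domain classifier."""
    return GradientReversal.apply(x, lambd)
```

In the paper's architecture this layer sits between the feature extractor and the domain classifier, so the features are pushed to be uninformative for domain discrimination while remaining discriminative for the label predictor.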
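
Similarly, the adaptation-factor schedule and the half-source/half-target batch composition quoted in the Experiment Setup row can be sketched as follows (function and variable names are illustrative assumptions; the paper fixes γ = 10 and a batch size of 128):

```python
import numpy as np


def adaptation_factor(p, gamma=10.0):
    """Schedule from the paper: lambda grows from 0 to 1 as training progress p goes 0 -> 1."""
    return 2.0 / (1.0 + np.exp(-gamma * p)) - 1.0


def make_batch(source_images, source_labels, target_images, batch_size=128, rng=np.random):
    """Illustrative batch assembly: half labeled source samples, half unlabeled target samples."""
    half = batch_size // 2
    src_idx = rng.choice(len(source_images), half, replace=False)
    tgt_idx = rng.choice(len(target_images), half, replace=False)
    images = np.concatenate([source_images[src_idx], target_images[tgt_idx]])
    labels = source_labels[src_idx]  # class labels exist only for the source half
    domains = np.concatenate([np.zeros(half), np.ones(half)])  # 0 = source, 1 = target
    return images, labels, domains


# Example: at 30% of training, adaptation_factor(0.3) = 2/(1 + exp(-3)) - 1 ≈ 0.905,
# so the domain-classifier signal is already near full strength.
```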