Conditional Adversarial Domain Adaptation

Authors: Mingsheng Long, Zhangjie Cao, Jianmin Wang, Michael I. Jordan

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our models exceed state-of-the-art results on five benchmark datasets. ... We evaluate the proposed conditional domain adversarial networks with many state-of-the-art transfer learning and deep learning methods."
Researcher Affiliation | Academia | School of Software, Tsinghua University, China; KLiss, MOE; BNRist; Research Center for Big Data, Tsinghua University, China; University of California, Berkeley, Berkeley, USA. {mingsheng, jimwang}@tsinghua.edu.cn; caozhangjie14@gmail.com; jordan@berkeley.edu
Pseudocode | No | The paper contains mathematical formulations and descriptions of the model, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | "Codes will be available at http://github.com/thuml/CDAN."
Open Datasets | Yes | "Office-31 [42] is the most widely used dataset for visual domain adaptation... ImageCLEF-DA [1] is a dataset organized by selecting the 12 common classes shared by three public datasets... We investigate three digits datasets: MNIST, USPS, and Street View House Numbers (SVHN). VisDA-2017 [2] is a challenging simulation-to-real dataset..." [1] http://imageclef.org/2014/adaptation [2] http://ai.bu.edu/visda-2017/
Dataset Splits | Yes | "It contains over 280K images across 12 classes in the training, validation and test domains. ... We conduct importance-weighted cross-validation (IWCV) [48] to select hyper-parameters for all methods."
Hardware Specification | No | The paper states "We implement AlexNet-based methods in Caffe and ResNet-based methods in PyTorch." but does not provide any specific hardware details such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions software like Caffe and PyTorch but does not specify their version numbers or the versions of any other software dependencies, making the description not fully reproducible.
Experiment Setup | Yes | "We fine-tune from ImageNet pre-trained models [41]... We train the new layers and classifier layer through back-propagation, where the classifier is trained from scratch with learning rate 10 times that of the lower layers. We adopt mini-batch SGD with momentum of 0.9 and the learning rate annealing strategy as [13]: the learning rate is adjusted by η_p = η_0 (1 + αp)^(−β), where p is the training progress changing from 0 to 1, and η_0 = 0.01, α = 10, β = 0.75 are optimized by the importance-weighted cross-validation [48]. We adopt a progressive training strategy for the discriminator, increasing λ from 0 to 1 according to λ_p = (1 − exp(−δp)) / (1 + exp(−δp)), δ = 10. As CDAN performs stably under different parameters, we fix λ = 1 for all experiments."
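
The two schedules quoted above can be sketched as plain functions of the training progress p in [0, 1]; a minimal sketch (function names and defaults are ours, taken from the quoted hyper-parameters, not from the paper's code):

```python
import math

def lr_schedule(p, eta0=0.01, alpha=10.0, beta=0.75):
    """Learning-rate annealing: eta_p = eta0 * (1 + alpha*p)^(-beta),
    with p the training progress in [0, 1]."""
    return eta0 * (1.0 + alpha * p) ** (-beta)

def lambda_schedule(p, delta=10.0):
    """Progressive discriminator weight: lambda ramps from 0 to 1 as
    (1 - exp(-delta*p)) / (1 + exp(-delta*p))."""
    return (1.0 - math.exp(-delta * p)) / (1.0 + math.exp(-delta * p))
```

At p = 0 the learning rate starts at η_0 = 0.01 and λ at 0; as p approaches 1 the learning rate decays monotonically while λ saturates near 1, matching the quoted "increasing λ from 0 to 1" behavior.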