Transferable Normalization: Towards Improving Transferability of Deep Neural Networks

Authors: Ximei Wang, Ying Jin, Mingsheng Long, Jianmin Wang, Michael I. Jordan

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate TransNorm with two series of experiments: (a) basic experiments applying TransNorm to the ResNet-50 backbone for the seminal domain adaptation methods DANN [4] and CDAN [18] on three standard datasets; (b) generalization to a wider variety of domain adaptation methods, more kinds of network backbones, and more challenging datasets. Experimental results indicate that TransNorm consistently improves the transferability of state-of-the-art domain adaptation methods over five standard datasets.
Researcher Affiliation | Academia | Ximei Wang, Ying Jin, Mingsheng Long, Jianmin Wang, and Michael I. Jordan. School of Software, BNRist, Tsinghua University, China; Research Center for Big Data, Tsinghua University, China; National Engineering Laboratory for Big Data Software; University of California, Berkeley, USA. {wxm17,jiny18}@mails.tsinghua.edu.cn; {mingsheng,jimwang}@tsinghua.edu.cn; jordan@cs.berkeley.edu
Pseudocode | Yes | Algorithm 1: Transferable Normalization (TransNorm). Input: values of x in a mini-batch from the source domain $\mathcal{D}_s = \{x_{s,i}\}_{i=1}^{m}$ and the target domain $\mathcal{D}_t = \{x_{t,i}\}_{i=1}^{m}$, the mini-batch size $m$, and the number of channels $c$ in each layer. Parameters shared across domains to be learned: $\gamma, \beta$. Output: $\{y_{s,i} = \mathrm{TransNorm}_{\gamma,\beta}(x_{s,i})\}$ and $\{y_{t,i} = \mathrm{TransNorm}_{\gamma,\beta}(x_{t,i})\}$.
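To make the algorithm concrete, here is a minimal PyTorch sketch of a TransNorm-style forward pass for fully-connected (1-D) features, written from the description above. The module name, the 1-D setting, the attention formula $\alpha_j = c\,(1+d_j)^{-1} / \sum_{k=1}^{c}(1+d_k)^{-1}$, and the omission of BN-style running statistics are our reading of the paper, not the authors' reference implementation (see the repository linked below for that).

```python
import torch
import torch.nn as nn

class TransNorm1d(nn.Module):
    """Sketch of TransNorm: normalize source and target mini-batches with
    domain-specific statistics, share the affine parameters (gamma, beta),
    and re-weight channels by their cross-domain transferability."""

    def __init__(self, num_features: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # gamma and beta are shared across domains, as in Algorithm 1.
        self.gamma = nn.Parameter(torch.ones(num_features))
        self.beta = nn.Parameter(torch.zeros(num_features))

    def _normalize(self, x: torch.Tensor):
        # Per-channel statistics of one domain's mini-batch, shape (C,).
        mean = x.mean(dim=0)
        var = x.var(dim=0, unbiased=False)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta, mean, var

    def forward(self, x_s: torch.Tensor, x_t: torch.Tensor):
        # (1) Domain-specific normalization with shared gamma and beta.
        y_s, mu_s, var_s = self._normalize(x_s)
        y_t, mu_t, var_t = self._normalize(x_t)

        # (2) Channel-wise distance between source and target statistics.
        d = torch.abs(mu_s / torch.sqrt(var_s + self.eps)
                      - mu_t / torch.sqrt(var_t + self.eps))

        # (3) Transferability attention: channels whose statistics are
        # closer across domains (smaller d) get larger weights; the
        # weights sum to the channel count c.
        w = 1.0 / (1.0 + d)
        alpha = d.numel() * w / w.sum()

        # (4) Re-weight both domains' normalized outputs by (1 + alpha).
        return y_s * (1.0 + alpha), y_t * (1.0 + alpha)
```

At inference time the method would additionally need per-domain running statistics, as in standard batch normalization; they are omitted here to keep the sketch short.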
Open Source Code | Yes | The code of TransNorm is available at http://github.com/thuml/TransNorm.
Open Datasets | Yes | We use five domain adaptation datasets: Office-31 [31], with 31 categories and 4,652 images collected from three domains: Amazon (A), DSLR (D), and Webcam (W); ImageCLEF-DA, with 12 classes shared by three public datasets (domains): Caltech-256 (C), ImageNet ILSVRC 2012 (I), and Pascal VOC 2012 (P); Office-Home [39], with 65 classes and 15,500 images from four significantly different domains: Artistic images (Ar), Clip Art (Cl), Product images (Pr), and Real-World images (Rw); the Digits datasets MNIST (M) and Street View House Numbers (SVHN, S); and VisDA-2017 [27], a simulation-to-real dataset involving over 280K images from 12 categories.
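All five benchmarks are standard image-classification corpora, so a transfer task reduces to pairing a labeled source loader with an unlabeled target loader. Below is a hedged sketch for one Office-31 task; the directory layout ("office31/<domain>/<class>/...") and the batch size are illustrative assumptions, not the paper's protocol.

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Standard ImageNet-style preprocessing for a ResNet-50 backbone.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Office-31 task A -> W: Amazon is the labeled source domain, Webcam the
# target domain (target labels are never used during training).
source = datasets.ImageFolder("office31/amazon", transform=transform)
target = datasets.ImageFolder("office31/webcam", transform=transform)

source_loader = DataLoader(source, batch_size=36, shuffle=True)
target_loader = DataLoader(target, batch_size=36, shuffle=True)
```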
Dataset Splits | No | The paper states 'We follow the standard protocols for unsupervised domain adaptation' and 'We conduct Deep Embedded Validation (DEV) [45] to select the hyper-parameters for all methods', but it provides neither specific percentages or sample counts for training/validation/test splits nor an explicit citation for the splits as applied in its experiments.
Hardware Specification | No | The paper does not specify hardware details such as the GPU or CPU models used to run the experiments.
Software Dependencies | No | The paper states 'Our methods were implemented based on PyTorch', but it provides no version numbers for PyTorch or any other software dependency.
Experiment Setup | No | The paper mentions 'We conduct Deep Embedded Validation (DEV) [45] to select the hyper-parameters for all methods', but it does not provide the specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or other training configurations.