Asymmetric Tri-training for Unsupervised Domain Adaptation
Authors: Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our method using digit classification tasks, traffic sign classification tasks, and sentiment analysis tasks using the Amazon Review dataset, and demonstrated its state-of-the-art performance for nearly all of the conducted experiments. In particular, for the adaptation scenario MNIST → SVHN, our method outperformed other methods by more than 10%. |
| Researcher Affiliation | Academia | ¹The University of Tokyo, Tokyo, Japan; ²RIKEN, Japan. |
| Pseudocode | Yes | Algorithm 1 ("iter" denotes the training iteration; the function Labeling indicates the labeling method; pseudo-labels are assigned to samples when the predictions of F1 and F2 agree and at least one of them is confident in its prediction). Input: data Xs = {(x_i, t_i)}_{i=1}^{m}, Xt = {(x_j)}_{j=1}^{n}. Xtl = ∅. For j = 1 to iter: train F, F1, F2, Ft with a mini-batch from training set S. Then Nt = N_init; Xtl = Labeling(F, F1, F2, Xt, Nt); L = Xs ∪ Xtl. For K steps: (for j = 1 to iter: train F, F1, F2 with a mini-batch from training set L; train F, Ft with a mini-batch from training set Xtl); Xtl = ∅; Nt = (K/20)·n; Xtl = Labeling(F, F1, F2, Xt, Nt); L = Xs ∪ Xtl. (A hedged Python sketch of the Labeling step follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code, nor does it include any links to a code repository for the methodology described. |
| Open Datasets | Yes | The digit datasets include MNIST (LeCun et al., 1998), MNIST-M (Ganin & Lempitsky, 2014), Street View House Numbers (SVHN) (Netzer et al., 2011), and Synthetic Digits (SYN DIGITS) (Ganin & Lempitsky, 2014). We further evaluated our method on traffic sign datasets including Synthetic Traffic Signs (SYN SIGNS) (Moiseev et al., 2013) and the German Traffic Sign Recognition Benchmark (GTSRB) (Stallkamp et al., 2011). |
| Dataset Splits | Yes | From 59,001 target training samples, we randomly selected 1,000 labeled target samples as a validation split and tuned the hyper-parameters. In both settings, we used 1,000 labeled target samples to find the optimal hyperparameters. In addition, we used 1,000 SVHN samples as the validation set. A total of 3,000 labeled target samples were used for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | The paper mentions optimizers like 'Momentum SGD' and 'Adagrad' but does not specify any software libraries or frameworks with version numbers (e.g., 'PyTorch 1.9', 'TensorFlow 2.x'). |
| Experiment Setup | Yes | In our experiments on the image datasets, we employed the CNN architecture used in (Ganin & Lempitsky, 2014). For a fair comparison, we separated the network at the hidden layer from which (Ganin & Lempitsky, 2014) constructed discriminator networks. Based on validation, we set the threshold value for the labeling method to 0.95 in MNIST → SVHN; in other scenarios, we set it to 0.9. We used Momentum SGD for optimization, setting the momentum to 0.9 and the learning rate to 0.01. λ was set to 0.01 for all scenarios based on our validation. For our experiments on the Amazon Review dataset, we used an architecture similar to that used in (Ganin et al., 2016): one dense hidden layer with 50 sigmoid-activated hidden units and a softmax output. (Hedged code sketches of these settings follow the table.) |
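The Labeling step quoted in the Pseudocode row is compact enough to sketch. Below is a minimal NumPy sketch of that selection rule: a target sample is pseudo-labeled when the two labeling networks F1 and F2 agree and at least one is confident above the threshold. Since the paper releases no code, the function name `label_targets` and the tie-breaking by confidence when more than Nt candidates qualify are illustrative assumptions.

```python
import numpy as np

def label_targets(probs_f1, probs_f2, n_max, threshold=0.9):
    """Select up to n_max pseudo-labeled target samples.

    probs_f1, probs_f2: (N, C) softmax outputs of the two labelers F1, F2.
    n_max: cap on the candidate pool size (Nt in Algorithm 1).
    threshold: confidence cutoff (0.95 for MNIST -> SVHN, 0.9 otherwise).
    """
    pred1 = probs_f1.argmax(axis=1)
    pred2 = probs_f2.argmax(axis=1)
    # Confidence of the more confident of the two labelers per sample.
    conf = np.maximum(probs_f1.max(axis=1), probs_f2.max(axis=1))
    # Keep samples where the labelers agree and at least one is confident.
    mask = (pred1 == pred2) & (conf > threshold)
    idx = np.where(mask)[0]
    if len(idx) > n_max:
        # Assumption: keep the n_max most confident candidates.
        idx = idx[np.argsort(conf[idx])[::-1][:n_max]]
    return idx, pred1[idx]
```

The returned indices and agreed-upon labels form the pseudo-labeled set Xtl that Algorithm 1 merges with the source set (L = Xs ∪ Xtl) before the next round of training.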
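The Experiment Setup row also translates directly into code. The following hedged PyTorch sketch shows the quoted optimization settings (Momentum SGD with lr = 0.01 and momentum = 0.9, λ = 0.01); in the paper, λ weights a |W1ᵀW2| penalty that pushes the two labelers F1 and F2 toward different features. The module names, the 128-d feature size, and the 5,000-d Amazon Review input are illustrative assumptions, not taken from an official release.

```python
import torch
import torch.nn as nn

feat_dim, n_classes = 128, 10        # assumed dimensions
f1 = nn.Linear(feat_dim, n_classes)  # labeler F1 (first dense layer)
f2 = nn.Linear(feat_dim, n_classes)  # labeler F2 (first dense layer)
lam = 0.01                           # lambda, per the validation above

def weight_penalty(a: nn.Linear, b: nn.Linear) -> torch.Tensor:
    # |W1^T W2|: sum of absolute entries of the product of the two
    # weight matrices (nn.Linear stores weights as (out, in)).
    return (a.weight @ b.weight.t()).abs().sum()

optimizer = torch.optim.SGD(
    list(f1.parameters()) + list(f2.parameters()),
    lr=0.01, momentum=0.9)

# Inside a training step the penalty would be added to the task loss:
# loss = task_loss + lam * weight_penalty(f1, f2)

# The Amazon Review classifier quoted above would look roughly like:
sentiment_head = nn.Sequential(
    nn.Linear(5000, 50),  # input size is an assumption (bag-of-words)
    nn.Sigmoid(),         # sigmoid-activated 50-unit hidden layer
    nn.Linear(50, 2),     # softmax handled by nn.CrossEntropyLoss
)
```

Adding the |W1ᵀW2| term to the classification loss is what makes the tri-training "asymmetric": F1 and F2 are discouraged from collapsing into identical labelers, so their agreement carries more signal when assigning pseudo-labels.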