f-Domain Adversarial Learning: Theory and Algorithms
Authors: David Acuna, Guojun Zhang, Marc T. Law, Sanja Fidler
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental analysis conducted on real-world natural language and computer vision datasets shows that our framework outperforms existing baselines, and obtains the best results for f-divergences that were not considered previously in domain-adversarial learning. |
| Researcher Affiliation | Collaboration | NVIDIA; University of Toronto; Vector Institute; University of Waterloo. |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks, only descriptive text and a framework diagram (Figure 1). |
| Open Source Code | No | The paper does not provide any links to open-source code or explicit statements about code availability. |
| Open Datasets | Yes | We evaluate our method on two digits datasets MNIST and USPS with two transfer tasks (M→U and U→M). We adopt the splits and evaluation protocol from (Long et al., 2018)... We use two visual benchmarks: (1) the Office-31 dataset (Saenko et al., 2010) contains 4,652 images and 31 categories... (2) the Office-Home dataset (Venkateswara et al., 2017) contains 15,500 images... For this task, we consider the Amazon product reviews dataset (Blitzer et al., 2006) which contains online reviews of different products collected on the Amazon website. |
| Dataset Splits | Yes | For the Digits datasets... We adopt the splits and evaluation protocol from (Long et al., 2018), which consist of 60,000 and 7,291 training images and the standard test sets of 10,000 and 2,007 test images for MNIST and USPS, respectively. For the first two tasks, hyperparameters are determined based on a subset (10%) of the training set for one task (e.g. M→U and B→D) and kept constant for the others. For each task, we use predefined sets of 2000 instances of source and target data samples for training, and keep 4000 instances of the target domain for testing. |
| Hardware Specification | Yes | Experiments were carried out using a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with version numbers. |
| Experiment Setup | Yes | Implementation Details: We implement our algorithm in PyTorch. For the Digits datasets, the implementation details follow (Long et al., 2018). Thus, the backbone network is LeNet (LeCun et al., 1998). The main classifier (ĥ) and auxiliary classifier (ĥ′) are both 2 linear layers with ReLU non-linearities and Dropout (0.5) in the last layer. For the NLP task, we follow the standard protocol from Courty et al. (2017); Ganin et al. (2016) and use a simple 2-layer model with sigmoid activation function. For the visual datasets, we use ResNet-50 (He et al., 2016) pretrained on ImageNet (Deng et al., 2009) as the backbone network. The main classifier (ĥ) and auxiliary classifier (ĥ′) are both 2-layer neural nets with Leaky-ReLU activation functions. We use spectral normalization (SN) as in (Miyato et al., 2018) only for these two (i.e., ĥ and ĥ′). For the Digits datasets, we set batch size=128, the learning rate for the feature extractor g to 0.01, and the learning rate for the classifiers ĥ and ĥ′ to 0.001. We train for 15,000 iterations for the Digits and NLP tasks, and 50,000 iterations for the Visual tasks. For both the Digits and NLP tasks, we use a momentum of 0.9 and a weight decay of 0.0005. For the visual tasks, the momentum is 0.9 and the weight decay is 0.0001. |
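The classifier heads and optimizer settings quoted above can be sketched in PyTorch. The layer widths, number of classes, and the stand-in backbone below are assumptions for illustration; the paper reports only the layer counts, activations, spectral normalization, and the optimizer hyperparameters.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

def make_classifier(in_dim=2048, hidden_dim=1024, n_classes=31):
    """Two-layer classifier head with Leaky-ReLU and spectral
    normalization, as described for the visual tasks. Hidden width
    is an assumption; the paper does not report it."""
    return nn.Sequential(
        spectral_norm(nn.Linear(in_dim, hidden_dim)),
        nn.LeakyReLU(),
        spectral_norm(nn.Linear(hidden_dim, n_classes)),
    )

# Main classifier ĥ and auxiliary classifier ĥ′ share the same
# architecture (spectral norm is applied only to these two heads).
h_hat = make_classifier()
h_hat_prime = make_classifier()

# Optimizer settings reported for the Digits tasks: lr=0.01 for the
# feature extractor g, lr=0.001 for both classifiers, momentum 0.9,
# weight decay 0.0005. The linear layer below is a stand-in for the
# LeNet/ResNet-50 backbone.
g = nn.Linear(784, 2048)
opt = torch.optim.SGD(
    [
        {"params": g.parameters(), "lr": 0.01},
        {"params": h_hat.parameters(), "lr": 0.001},
        {"params": h_hat_prime.parameters(), "lr": 0.001},
    ],
    momentum=0.9,
    weight_decay=5e-4,
)
```

Using per-parameter-group learning rates keeps the backbone and the two heads in a single optimizer while matching the different learning rates the paper specifies.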