Domain Generalization using Causal Matching
Authors: Divyat Mahajan, Shruti Tople, Amit Sharma
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our matching-based methods on rotated MNIST and Fashion-MNIST, PACS and Chest X-ray datasets. On all datasets, the simple methods MatchDG and MDGHybrid are competitive to state-of-the-art methods for out-of-domain accuracy. |
| Researcher Affiliation | Industry | 1Microsoft Research, India 2Microsoft Research, UK. |
| Pseudocode | Yes | Algorithm 1 MatchDG |
| Open Source Code | Yes | Also, the code repository can be accessed at: https://github.com/microsoft/robustdg |
| Open Datasets | Yes | We evaluate our matching-based methods on rotated MNIST and Fashion-MNIST, PACS and Chest X-ray datasets. For PACS dataset (Li et al., 2017), ... Chest X-rays. We introduce a harder real-world dataset based on Chest X-ray images from three different sources: NIH (Wang et al., 2017), CheXpert (Irvin et al., 2019) and RSNA (rsn, 2018). |
| Dataset Splits | Yes | While using a validation set from the test domain may improve classification accuracy, it goes against the problem motivation of generalization to unseen domains. Hence, we use only data from source domains to construct a validation set (except when explicitly mentioned in Table 4, to compare to past methods that use test domain validation). |
| Hardware Specification | No | The paper does not specify any hardware details, such as GPU/CPU models or memory, used for running the experiments. |
| Software Dependencies | No | The paper mentions using implementations from "DomainBed" and various neural network architectures (ResNet-18, ResNet-50, AlexNet) but does not provide specific version numbers for any software dependencies such as PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | To find matches, we optimize a contrastive representation learning loss that minimizes distance between same-class inputs from different domains in comparison to inputs from different classes across domains. Adapting the contrastive loss for a single domain (Chen et al., 2020), we consider positive matches as two inputs with the same class but different domains, and negative matches as pairs with different classes. For every positive match pair (x_j, x_k), we propose a loss where τ is a hyperparameter, B is the batch size, and sim(a, b) = Φ(x_a)ᵀΦ(x_b) / (‖Φ(x_a)‖ ‖Φ(x_b)‖) is the cosine similarity. ... we update the positive matches based on the nearest same-class pairs in representation space and iterate until convergence. Hence for each anchor point, starting with an initial set of positive matches, in each epoch a representation is learnt using contrastive learning; after which the positive matches are themselves updated based on the closest same-class data points across domains in the representation. ... For all matching-based methods, we use the cross-entropy loss for L_d and ℓ2 distance for dist in Eq. (3). |
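The Experiment Setup cell above describes two pieces of the MatchDG training loop: a temperature-scaled contrastive loss over same-class, cross-domain positive pairs, and a periodic update of those positive pairs to the nearest same-class points in representation space. The sketch below illustrates both pieces in NumPy; it is a minimal illustration of the described procedure, not the authors' implementation (their code is in the linked `robustdg` repository), and the function names `contrastive_match_loss` and `update_matches` are ours.

```python
import numpy as np

def cosine_sim(a, b):
    # sim(a, b) = a.b / (||a|| ||b||), the cosine similarity used in the loss.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def contrastive_match_loss(reps, labels, pos_pairs, tau=0.1):
    """Contrastive loss over positive match pairs (illustrative sketch).

    reps      : (N, d) array of representations Phi(x) for one batch.
    labels    : length-N class labels.
    pos_pairs : list of (j, k) index pairs -- same class, different domains.
    tau       : temperature hyperparameter.
    """
    total = 0.0
    for j, k in pos_pairs:
        pos = np.exp(cosine_sim(reps[j], reps[k]) / tau)
        # Negatives: batch points whose class differs from the anchor's.
        neg = sum(
            np.exp(cosine_sim(reps[j], reps[i]) / tau)
            for i in range(len(reps))
            if labels[i] != labels[j]
        )
        total += -np.log(pos / (pos + neg))
    return total / len(pos_pairs)

def update_matches(reps, labels, domains):
    # After each epoch, re-select each point's positive match as the
    # nearest same-class point from a *different* domain in rep. space.
    pairs = []
    for j in range(len(reps)):
        cands = [i for i in range(len(reps))
                 if labels[i] == labels[j] and domains[i] != domains[j]]
        if cands:
            best = min(cands, key=lambda i: np.linalg.norm(reps[j] - reps[i]))
            pairs.append((j, best))
    return pairs
```

As expected of a contrastive objective, the loss shrinks as cross-domain positives move closer together in representation space relative to the negatives, which is what drives the alternating learn-then-rematch iteration toward convergence.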