In Search of Lost Domain Generalization

Authors: Ishaan Gulrajani, David Lopez-Paz

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | When conducting extensive experiments using DOMAINBED, we find that, when carefully implemented and tuned, ERM outperforms the state-of-the-art in terms of average performance.
Researcher Affiliation | Collaboration | Ishaan Gulrajani (Stanford University, igul222@gmail.com); David Lopez-Paz (Facebook AI Research, dlp@fb.com)
Pseudocode | Yes | Appendix B.5, EXTENDING DOMAINBED: For example, to implement group DRO (Sagawa et al., 2019, Algorithm 1), we simply write the following in algorithms.py:

    class GroupDRO(ERM):
        def __init__(self, input_shape, num_classes, num_domains, hparams):
            super().__init__(input_shape, num_classes, num_domains, hparams)
            self.register_buffer("q", torch.Tensor())

        def update(self, minibatches):
            device = "cuda" if minibatches[0][0].is_cuda else "cpu"
            if not len(self.q):
                self.q = torch.ones(len(minibatches)).to(device)
            losses = torch.zeros(len(minibatches)).to(device)
            for m in range(len(minibatches)):
                x, y = minibatches[m]
                losses[m] = F.cross_entropy(self.predict(x), y)
                self.q[m] *= (self.hparams["dro_eta"] * losses[m].data).exp()
            self.q /= self.q.sum()
            loss = torch.dot(losses, self.q) / len(minibatches)
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
            return {"loss": loss.item()}

(A short sketch of the reweighting step in this snippet follows the table.)
Open Source Code | Yes | As a result of our research, we release DOMAINBED, a framework to streamline rigorous and reproducible experimentation in DG: https://github.com/facebookresearch/DomainBed/.
Open Datasets | Yes | DOMAINBED currently includes downloaders and loaders for seven standard DG image classification benchmarks. These are Colored MNIST (Arjovsky et al., 2019), Rotated MNIST (Ghifary et al., 2015), PACS (Li et al., 2017), VLCS (Fang et al., 2013), Office-Home (Venkateswara et al., 2017), Terra Incognita (Beery et al., 2018), and DomainNet (Peng et al., 2019).
Dataset Splits | Yes | We split the data from each domain into 80% and 20% splits. We use the larger splits for training and final evaluation, and the smaller splits to select hyperparameters (for an illustration, see Appendix B.3). (A per-domain split sketch follows the table.)
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) were mentioned for running the experiments.
Software Dependencies | No | The paper mentions PyTorch but does not specify a version number for it or for any other key software dependencies.
Experiment Setup | Yes | Table 6 lists all algorithm hyperparameters, their default values, and their random-search sweep distributions. We optimize all models using Adam (Kingma and Ba, 2015). (A random-search sketch follows the table.)
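
To make the group DRO update quoted in the Pseudocode row concrete, here is a minimal sketch of its reweighting step in isolation: per-domain weights q are scaled by exp(eta * loss) and renormalized, so higher-loss domains receive more weight in the combined objective. The eta value and toy losses below are illustrative assumptions, not values from the paper.

    import torch

    eta = 0.01                              # plays the role of hparams["dro_eta"]
    q = torch.ones(3)                       # one weight per training domain
    losses = torch.tensor([0.4, 1.2, 2.5])  # toy per-domain losses (assumed)

    q = q * (eta * losses).exp()            # upweight high-loss domains
    q = q / q.sum()                         # renormalize to a distribution
    weighted_loss = torch.dot(losses, q) / len(losses)

    print(q)              # the highest-loss domain gets the largest weight
    print(weighted_loss)  # the scalar that would be backpropagated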
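
The 80%/20% per-domain protocol in the Dataset Splits row can be reproduced with generic PyTorch utilities. The sketch below is one hedged way to do it; the function name, holdout-fraction argument, and seed are illustrative assumptions rather than DomainBed's own split helper.

    import torch
    from torch.utils.data import random_split

    def split_domain(domain_dataset, holdout_fraction=0.2, seed=0):
        # Split one domain's examples: the larger part is used for training and
        # final evaluation, the smaller part for hyperparameter selection, as
        # described in the quote above.
        n_holdout = int(len(domain_dataset) * holdout_fraction)
        generator = torch.Generator().manual_seed(seed)
        return random_split(
            domain_dataset,
            [len(domain_dataset) - n_holdout, n_holdout],
            generator=generator)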
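
For the Experiment Setup row, the sketch below wraps a random hyperparameter search around Adam in the spirit of the quoted setup. The specific distributions, ranges, and number of trials are illustrative assumptions and do not reproduce the paper's Table 6.

    import random
    import torch

    def sample_hparams(rng):
        # Illustrative random-search distributions (assumed, not from Table 6).
        return {
            "lr": 10 ** rng.uniform(-5, -3.5),         # log-uniform learning rate
            "weight_decay": 10 ** rng.uniform(-6, -2),
            "batch_size": rng.choice([16, 32, 64]),
        }

    rng = random.Random(0)
    for trial in range(20):                            # number of trials is assumed
        hparams = sample_hparams(rng)
        model = torch.nn.Linear(10, 2)                 # placeholder network
        optimizer = torch.optim.Adam(
            model.parameters(),
            lr=hparams["lr"],
            weight_decay=hparams["weight_decay"])
        # ... train with this optimizer and pick the best trial on the
        # hyperparameter-selection split described above ...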