Estimating Generalization under Distribution Shifts via Domain-Invariant Representations
Authors: Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, deciding early stopping and error detection. |
| Researcher Affiliation | Academia | 1CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA. Correspondence to: Ching-Yao Chuang <cychuang@mit.edu>. |
| Pseudocode | Yes | Algorithm 1 provides details about approximating the proxy risk. In brief, we first pretrain h = f ∘ g, and then maximize the disagreement with h under constraints. (A minimal sketch follows the table.) |
| Open Source Code | Yes | The code is available at https://github.com/chingyaoc/estimating-generalization. |
| Open Datasets | Yes | Empirically, we examine our theory and algorithms on sentiment analysis (Amazon review dataset), digit classification (MNIST, MNIST-M, SVHN) and general object classification (Office-31). |
| Dataset Splits | No | A validation set from the source domain is used as an early stopping criterion during learning. However, specific details about its size, proportions, or how it was derived (e.g., exact percentages or sample counts) are not provided. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., specific programming languages, libraries, or frameworks with their respective versions). |
| Experiment Setup | No | The paper describes how architectural parameters (like number of layers) were varied during experiments and mentions using a 'progressive training strategy for the discriminator', but it does not provide specific hyperparameters such as learning rates, batch sizes, optimizers, or training epochs. |
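To make the "Pseudocode" row concrete, below is a minimal PyTorch sketch of the proxy-risk idea described there: pretrain h = f ∘ g, then train a check model h' that is constrained to agree with h on the source while maximizing disagreement with h on the unlabeled target; the target disagreement rate serves as the proxy risk. This is an illustrative reconstruction, not the authors' implementation: the function name `estimate_proxy_risk`, the loader interface, and all hyperparameters are assumptions, and Algorithm 1's additional constraint that the check model remain domain-invariant (e.g., via a domain discriminator) is omitted for brevity.

```python
# Hypothetical sketch of proxy-risk estimation (not the authors' code).
# Assumes PyTorch, a pretrained encoder g and classifier head f (h = f o g),
# and data loaders yielding (inputs, labels) batches; target labels are unused.
import copy
import torch
import torch.nn.functional as F

def estimate_proxy_risk(f, g, source_loader, target_loader,
                        steps=1000, lam=1.0, lr=1e-4):
    """Train a check model h' = f2 o g2 that matches h on the source but
    maximally disagrees with h on the target; return the target
    disagreement rate (the proxy risk, an estimate of target error)."""
    f2, g2 = copy.deepcopy(f), copy.deepcopy(g)  # check model h'
    opt = torch.optim.Adam(list(f2.parameters()) + list(g2.parameters()), lr=lr)
    for step, ((xs, _), (xt, _)) in enumerate(zip(source_loader, target_loader)):
        if step >= steps:
            break
        with torch.no_grad():  # h's predictions stay fixed
            ys_h = f(g(xs)).argmax(dim=1)
            yt_h = f(g(xt)).argmax(dim=1)
        # (1) constraint: h' agrees with h on the source domain
        agree = F.cross_entropy(f2(g2(xs)), ys_h)
        # (2) objective: h' disagrees with h on the target domain
        disagree = F.cross_entropy(f2(g2(xt)), yt_h)
        loss = agree - lam * disagree
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Proxy risk = fraction of target points where h and h' disagree.
    total, mismatched = 0, 0
    with torch.no_grad():
        for xt, _ in target_loader:
            total += xt.size(0)
            mismatched += (f(g(xt)).argmax(1) != f2(g2(xt)).argmax(1)).sum().item()
    return mismatched / total
```

Usage would look like `risk = estimate_proxy_risk(f, g, src_loader, tgt_loader)`; the returned value upper-bounds the estimated target error of h under the paper's domain-invariance assumptions.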