(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy
Authors: Elan Rosenfeld, Saurabh Garg
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across numerous vision datasets (BREEDS [65], FMoW-WILDS [35], VisDA [51], DomainNet [53], CIFAR10, CIFAR100 [36] and OfficeHome [69]) demonstrate the effectiveness of our bound. (A sketch of the bound appears after the table.) |
| Researcher Affiliation | Academia | Elan Rosenfeld Machine Learning Department Carnegie Mellon University elan@cmu.edu Saurabh Garg Machine Learning Department Carnegie Mellon University |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | We include the model with our publicly available repository. |
| Open Datasets | Yes | We conduct experiments across 11 vision benchmark datasets for distribution shift, spanning applications in object classification, satellite imagery, and medicine. We use four BREEDS datasets [65]: Entity13, Entity30, Nonliving26, and Living17; FMoW [11] and Camelyon [4] from WILDS [35]; OfficeHome [69]; VisDA [52, 51]; CIFAR10, CIFAR100 [36]; and DomainNet [53]. |
| Dataset Splits | Yes | We use source hold-out performance to pick the best hyperparameters for the UDA methods, since we lack labeled validation data from the target distribution. For all methods, we implement post-hoc calibration on validation source data with temperature scaling [25], which has been shown to improve performance (a temperature-scaling sketch follows the table). We use the original train set as source and the OOD val and OOD test splits as target domains, as they are collected over different time periods. Overall, we obtain 3 different domains. |
| Hardware Specification | Yes | Our experiments were performed across a combination of Nvidia T4, A6000, and V100 GPUs. |
| Software Dependencies | No | The paper mentions using 'the standard pytorch implementation [19]' but does not provide specific version numbers for PyTorch or any other software dependencies. It also mentions 'Transfer Learning Library [31]' but without a version. |
| Experiment Setup | Yes | First, we tune the learning rate and ℓ2 regularization parameter, fixing for each dataset the batch size corresponding to the maximum we can fit in 15GB of GPU memory. We set the number of epochs for training as per the suggestions of the authors of the respective benchmarks. We summarize the learning rate, batch size, number of epochs, and ℓ2 regularization parameter used in our study in Table A.3. |
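
For context on the "bound" referenced in the Research Type row: the paper bounds target error via the disagreement discrepancy between the classifier and a trained critic. Below is a minimal NumPy sketch of how the empirical bound could be computed from hard predictions; the array names (`h_src`, `h_tgt`, `critic_src`, `critic_tgt`, `y_src`) are hypothetical placeholders, not identifiers from the authors' repository.

```python
import numpy as np

def empirical_disagreement_bound(h_src: np.ndarray, critic_src: np.ndarray,
                                 y_src: np.ndarray, h_tgt: np.ndarray,
                                 critic_tgt: np.ndarray) -> float:
    """Sketch of the disagreement-discrepancy bound on target error:

        eps_T(h) <= eps_S(h) + [ P_T(h != h') - P_S(h != h') ],

    where h is the classifier and h' is a critic trained to agree with h
    on source and disagree on target. All inputs are 1-D label arrays.
    """
    eps_s = np.mean(h_src != y_src)  # source error of h (labels available on source)
    # Disagreement discrepancy: critic/classifier disagreement on target minus source.
    delta = np.mean(h_tgt != critic_tgt) - np.mean(h_src != critic_src)
    return float(eps_s + delta)
```

The bound is "(almost) provable" in that it holds whenever the trained critic achieves at least as large a discrepancy as the true labeling function would.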
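
The Dataset Splits row also mentions post-hoc calibration with temperature scaling [25]. As a minimal PyTorch sketch, assuming `val_logits` and `val_labels` are hypothetical tensors holding the model's pre-softmax outputs and labels on the held-out source validation split:

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Fit a single scalar temperature T by minimizing NLL on validation data."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so that T stays positive
    optimizer = torch.optim.LBFGS([log_t], lr=0.01, max_iter=50)

    def closure():
        optimizer.zero_grad()
        loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
        loss.backward()
        return loss

    optimizer.step(closure)
    return log_t.exp().item()

# Usage: divide test-time logits by the fitted temperature before the softmax.
# T = fit_temperature(val_logits, val_labels)
# calibrated_probs = torch.softmax(test_logits / T, dim=-1)
```

Since temperature scaling rescales all logits by one scalar, it changes confidences but not predicted classes, which is why it can be applied post hoc without affecting accuracy.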