Characterizing Out-of-Distribution Error via Optimal Transport
Authors: Yuzhe Lu, Yilong Qin, Runtian Zhai, Andrew Shen, Ketong Chen, Zhenlin Wang, Soheil Kolouri, Simon Stepputtis, Joseph Campbell, Katia Sycara
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate COT and COTT on a variety of standard benchmarks that induce various types of distribution shift (synthetic, novel subpopulation, and natural) and show that our approaches significantly outperform existing state-of-the-art methods with up to 3x lower prediction errors. |
| Researcher Affiliation | Academia | Yuzhe Lu 1, Yilong Qin 1, Runtian Zhai 1, Andrew Shen 1, Ketong Chen 1, Zhenlin Wang 1, Soheil Kolouri 2, Simon Stepputtis 1, Joseph Campbell 1, Katia Sycara 1; 1 Carnegie Mellon University, 2 Vanderbilt University |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code can be found at https://github.com/luyuzhe111/COT. |
| Open Datasets | Yes | We evaluate COT and COTT on a variety of standard benchmarks that induce various types of distribution shift (synthetic, novel subpopulation, and natural) and show that our approaches significantly outperform existing state-of-the-art methods with up to 3x lower prediction errors. (followed by a list of datasets such as CIFAR10, CIFAR100, ImageNet, BREEDS, and WILDS, all of which are standard, publicly available benchmarks) |
| Dataset Splits | Yes | For datasets without an official validation set, we randomly sampled a subset of the official training set as the validation set to perform calibration and learn thresholds for ATC and COTT. We reserved 10,000 images from the training set as the validation set. *(A split sketch follows the table.)* |
| Hardware Specification | Yes | We performed training in PyTorch [31], and we used RTX 6000 Ada GPUs. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify a version number for it or any other software libraries used. |
| Experiment Setup | Yes | We trained ResNet18 from scratch, using SGD with momentum equal to 0.9 for 300 epochs. We set weight decay to 5×10⁻⁴ and batch size to 200. We set the initial learning rate to 0.1 and multiply it by 0.1 every 100 epochs. *(A training sketch follows the table.)* |
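
The dataset-split row describes holding out 10,000 training images as a validation set for calibration and threshold learning. Below is a minimal sketch of such a split; the choice of CIFAR-10 and the fixed seed are illustrative assumptions, not details taken from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load the official training set. CIFAR-10 is an assumption; the paper
# also evaluates CIFAR-100, ImageNet, BREEDS, and WILDS benchmarks.
train_full = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

# Reserve 10,000 images as the validation set, as the paper describes.
# The fixed seed for a reproducible split is our assumption.
generator = torch.Generator().manual_seed(0)
train_set, val_set = random_split(
    train_full, [len(train_full) - 10_000, 10_000], generator=generator
)
```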
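
The experiment-setup row maps directly onto a standard PyTorch training configuration. The sketch below continues from the split above (`train_set`); the model constructor, loss, and data-loading details are assumptions beyond the hyperparameters quoted in the table.

```python
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR
from torch.utils.data import DataLoader
from torchvision.models import resnet18

# ResNet18 trained from scratch (no pretrained weights), per the paper;
# the stock torchvision architecture and num_classes=10 are assumptions.
model = resnet18(num_classes=10)
criterion = nn.CrossEntropyLoss()

# SGD with momentum 0.9, initial lr 0.1, weight decay 5e-4, batch size 200.
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# Multiply the learning rate by 0.1 every 100 epochs.
scheduler = StepLR(optimizer, step_size=100, gamma=0.1)

# `train_set` is the 40,000-image split from the previous sketch.
loader = DataLoader(train_set, batch_size=200, shuffle=True)

for epoch in range(300):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```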