Characterizing Out-of-Distribution Error via Optimal Transport

Authors: Yuzhe Lu, Yilong Qin, Runtian Zhai, Andrew Shen, Ketong Chen, Zhenlin Wang, Soheil Kolouri, Simon Stepputtis, Joseph Campbell, Katia Sycara

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate COT and COTT on a variety of standard benchmarks that induce various types of distribution shift (synthetic, novel subpopulation, and natural) and show that our approaches significantly outperform existing state-of-the-art methods with up to 3x lower prediction errors.
Researcher Affiliation | Academia | Yuzhe Lu¹, Yilong Qin¹, Runtian Zhai¹, Andrew Shen¹, Ketong Chen¹, Zhenlin Wang¹, Soheil Kolouri², Simon Stepputtis¹, Joseph Campbell¹, Katia Sycara¹ (¹Carnegie Mellon University, ²Vanderbilt University)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code can be found at https://github.com/luyuzhe111/COT.
Open Datasets | Yes | We evaluate COT and COTT on a variety of standard benchmarks that induce various types of distribution shift (synthetic, novel subpopulation, and natural) and show that our approaches significantly outperform existing state-of-the-art methods with up to 3x lower prediction errors. (The paper then lists datasets such as CIFAR-10, CIFAR-100, ImageNet, BREEDS, and the WILDS benchmarks, all of which are standard publicly available datasets.)
Dataset Splits | Yes | For datasets without an official validation set, we randomly sampled a subset of the official training set as the validation set to perform calibration and learn thresholds for ATC and COTT. We reserved 10000 images from the training set as the validation set. (A split sketch follows the table.)
Hardware Specification | Yes | We performed training in PyTorch [31], and we used RTX 6000 Ada GPUs.
Software Dependencies | No | The paper mentions PyTorch but does not specify a version number for it or any other software libraries used.
Experiment Setup | Yes | We trained ResNet18 from scratch, using SGD with momentum equal to 0.9 for 300 epochs. We set weight decay to 5×10⁻⁴ and batch size to 200. We set the initial learning rate to 0.1 and multiply it by 0.1 every 100 epochs. (A training-setup sketch follows the table.)
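
The dataset-split entry above can be illustrated with a minimal PyTorch sketch. This is an assumption-laden reconstruction, not the authors' released code: it assumes CIFAR-10 loaded through torchvision, `random_split`, and a fixed seed, none of which are specified in the paper.

```python
# Minimal sketch of the described validation split (assumptions: CIFAR-10 via
# torchvision, random_split, and an arbitrary seed; the paper specifies only
# that 10,000 training images are reserved for validation).
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10(root="data", train=True, download=True,
                              transform=transforms.ToTensor())

# Reserve 10,000 images from the official training set as a validation set,
# used for calibration and for learning ATC/COTT thresholds.
val_size = 10_000
train_size = len(full_train) - val_size
generator = torch.Generator().manual_seed(0)  # seed is an assumption
train_set, val_set = random_split(full_train, [train_size, val_size],
                                  generator=generator)
```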
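The experiment-setup entry can likewise be sketched as a short PyTorch training recipe. It assumes the torchvision ResNet-18 constructor and the `train_set` from the split sketch above; the loop body is schematic rather than the authors' exact script, but the hyperparameters mirror those quoted from the paper.

```python
# Minimal sketch of the reported training recipe: ResNet-18 from scratch,
# SGD with momentum 0.9, weight decay 5e-4, batch size 200, initial learning
# rate 0.1 multiplied by 0.1 every 100 epochs, for 300 epochs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import models

model = models.resnet18(num_classes=10)  # trained from scratch (no pretraining)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            weight_decay=5e-4)
# Decay the learning rate by a factor of 0.1 every 100 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)
criterion = nn.CrossEntropyLoss()

train_loader = DataLoader(train_set, batch_size=200, shuffle=True)

for epoch in range(300):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```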