The Value of Out-of-Distribution Data
Authors: Ashwin De Silva, Rahul Ramesh, Carey Priebe, Pratik Chaudhari, Joshua T Vogelstein
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We analytically demonstrate these results via Fisher's Linear Discriminant on synthetic datasets, and empirically demonstrate them via deep networks on computer vision benchmarks such as MNIST, CIFAR-10, CINIC-10, PACS and DomainNet. (A toy FLD sketch appears after the table.) |
| Researcher Affiliation | Academia | ¹Johns Hopkins University, ²University of Pennsylvania. |
| Pseudocode | No | The paper contains detailed analytical derivations for Fisher's Linear Discriminant (FLD) in Appendix A, including equations for generalization error and weighted FLD, but it does not present any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/neurodata/value-of-ood-data. |
| Open Datasets | Yes | We experiment with several popular datasets including MNIST, CIFAR-10, PACS, and Domain Net and 3 different network architectures: (a) a small convolutional network with 0.12M parameters (denoted by Small Conv), (b) a wide residual network (Zagoruyko & Komodakis, 2016) of depth 10 and widening factor 2 (WRN-10-2), and (c) a larger wide residual network of depth 16 and widening factor 4 (WRN-16-4). See Appendix B.4 for more details. |
| Dataset Splits | Yes | For each value of m, we perform hyper-parameter tuning using Ray (Liaw et al., 2018) over a validation set that has only target samples, and record the target generalization error of the model using the best set of hyper-parameters. (A hedged Ray Tune sketch appears after the table.) |
| Hardware Specification | No | ADS and JTV were supported by the NSF AI Institute Planning award (#2020312), NSF-Simons Research Collaborations on the Mathematical and Scientific Foundations of Deep Learning (MoDL) and THEORINET. RR and PC were supported by grants from the National Science Foundation (IIS-2145164, CCF-2212519), Office of Naval Research (N00014-22-1-2255), and cloud computing credits from Amazon Web Services. |
| Software Dependencies | No | For each value of m, we perform hyper-parameter tuning using Ray (Liaw et al., 2018)... |
| Experiment Setup | Yes | The hyperparameters used for training are a learning rate of 0.01 and a weight decay of 10⁻⁵. All the networks are trained using stochastic gradient descent (SGD) with Nesterov's momentum and a cosine-annealed learning rate. (A hedged PyTorch sketch of this setup appears after the table.) |
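
The Research Type row cites Fisher's Linear Discriminant on synthetic datasets. As a rough illustration of that setup, and not the paper's analytical derivation, the toy sketch below pools a few target samples with a larger set of samples from a shifted OOD task, fits FLD, and estimates the target error on held-out target data. All means, covariances, and sample sizes here are illustrative assumptions.

```python
# Toy FLD sketch (illustrative assumptions only): pool n target samples with
# m samples from a shifted OOD task, fit FLD, and measure target error.
import numpy as np

rng = np.random.default_rng(0)

def sample_task(mu0, mu1, n_per_class):
    """Draw a balanced two-class Gaussian dataset with identity covariance."""
    X0 = rng.normal(mu0, 1.0, size=(n_per_class, 2))
    X1 = rng.normal(mu1, 1.0, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y

def fit_fld(X, y):
    """Fisher's Linear Discriminant: w = Sigma^{-1} (mu1 - mu0), midpoint threshold."""
    mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sigma = 0.5 * (np.cov(X[y == 0].T) + np.cov(X[y == 1].T))  # pooled within-class covariance
    w = np.linalg.solve(Sigma, mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1)
    return w, b

# Target task and an OOD task whose class means are shifted (assumed shift).
X_t, y_t = sample_task(mu0=[-1, 0], mu1=[1, 0], n_per_class=10)    # n target samples
X_o, y_o = sample_task(mu0=[-1, 1], mu1=[1, 1], n_per_class=100)   # m OOD samples

w, b = fit_fld(np.vstack([X_t, X_o]), np.concatenate([y_t, y_o]))

# Target generalization error estimated on a large held-out target set.
X_te, y_te = sample_task(mu0=[-1, 0], mu1=[1, 0], n_per_class=5000)
err = np.mean((X_te @ w + b > 0).astype(float) != y_te)
print(f"target error with OOD data pooled in: {err:.3f}")
```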
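
The Dataset Splits row describes per-m hyper-parameter tuning with Ray over a validation set containing only target samples. Below is a minimal sketch of how such a search might look with Ray Tune's function-based API (Ray 1.x style); this is not the authors' code, and the search space, sample count, and the `train_model` / `target_val_error` helpers are hypothetical stand-ins.

```python
# Hedged sketch of per-m hyper-parameter search with Ray Tune (not the authors' code).
from ray import tune

def train_model(lr, weight_decay):
    """Hypothetical stand-in for training a network on n target + m OOD samples."""
    return {"lr": lr, "weight_decay": weight_decay}

def target_val_error(model):
    """Hypothetical stand-in for the error on a target-only validation set."""
    return abs(model["lr"] - 0.01) + model["weight_decay"]

def trainable(config):
    model = train_model(config["lr"], config["weight_decay"])
    tune.report(val_error=target_val_error(model))  # Ray 1.x-style metric reporting

analysis = tune.run(
    trainable,
    config={
        "lr": tune.loguniform(1e-3, 1e-1),            # assumed search space
        "weight_decay": tune.loguniform(1e-6, 1e-4),  # assumed search space
    },
    num_samples=20,          # number of sampled configurations (assumed)
    metric="val_error",
    mode="min",
)
print(analysis.best_config)  # best hyper-parameters for this value of m
```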
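
The Experiment Setup row fixes the optimizer, learning rate, weight decay, and schedule. The following is a hedged PyTorch sketch of that configuration only; the network, momentum value, batch handling, and epoch count are assumptions not stated in the quoted text.

```python
# Hedged sketch of the reported optimization setup: SGD with Nesterov momentum,
# lr 0.01, weight decay 1e-5, cosine-annealed learning rate. Model and epochs are assumed.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))  # placeholder network
epochs = 100                                                     # assumed

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,              # reported learning rate
    momentum=0.9,         # assumed; Nesterov momentum requires a nonzero momentum
    nesterov=True,
    weight_decay=1e-5,    # reported weight decay
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

for epoch in range(epochs):
    # ... one pass over the mixed target + OOD training set would go here ...
    optimizer.step()      # placeholder step; a real loop computes a loss and backprops
    scheduler.step()      # cosine-annealed learning rate, stepped once per epoch
```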