Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Authors: Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical study shows Split-Ensemble, without additional computational cost, improves accuracy over a single model by 0.8%, 1.8%, and 25.5% on CIFAR-10, CIFAR-100, and Tiny-ImageNet, respectively. OOD detection for the same backbone and in-distribution datasets surpasses a single-model baseline by 2.2%, 8.1%, and 29.6% in mean AUROC, respectively (see the mean-AUROC sketch after the table). |
| Researcher Affiliation | Collaboration | ¹School of Computer Science, Peking University; ²University of California, Berkeley; ³Panasonic Holdings Corporation; ⁴Carnegie Mellon University. |
| Pseudocode | Yes | The detailed process of Split-Ensemble training is provided in the pseudo-code in Algorithm 1 of Appendix B. |
| Open Source Code | No | The paper does not provide a statement about releasing open-source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We perform classification tasks on four popular image classification benchmarks, including the CIFAR-10, CIFAR-100 (Krizhevsky, 2009), Tiny-ImageNet (Deng et al., 2009) and ImageNet (Krizhevsky et al., 2012) datasets (a data-loading sketch follows the table). |
| Dataset Splits | No | The paper mentions 'test(val) sets' and discusses training and testing phases, but it does not explicitly state training/validation/test split percentages, absolute sample counts, or citations to predefined splits that would define these proportions. |
| Hardware Specification | Yes | Our Split-Ensemble model was trained over 200 epochs using a single NVIDIA A100 GPU with 80GB of memory for experiments involving the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. For the larger-scale ImageNet dataset, we employ 8 NVIDIA A100 GPUs, each with 80GB memory, to handle the increased computational demands. |
| Software Dependencies | No | The paper mentions using the 'pytorch-ood' library (Kirchheim et al., 2022) for Gaussian and uniform noise generation, but it does not specify the version of PyTorch or of any other core software dependency used in its own implementation. |
| Experiment Setup | Yes | We use an SGD optimizer with a momentum of 0.9 and weight decay of 0.0005. We also adopt a 200-epoch cosine learning rate schedule with 10 warm-up epochs and a batch size of 256 (a minimal PyTorch sketch of this configuration follows the table). |
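The benchmarks quoted in the Open Datasets row are standard torchvision datasets. A minimal data-loading sketch, assuming the default torchvision splits (e.g., 50,000 training / 10,000 test images for CIFAR-10) and the commonly used per-channel normalization statistics; the paper itself does not quote its exact preprocessing:

```python
# Illustrative sketch: CIFAR-10 via torchvision with its default
# train/test split. The normalization statistics are the commonly used
# CIFAR-10 values, not numbers quoted from the paper.
import torch
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])

train_set = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=transform)

# Batch size 256 matches the experiment setup quoted in the table.
train_loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True, num_workers=4)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=256, shuffle=False, num_workers=4)
```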
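The Experiment Setup row translates directly into PyTorch. Below is a minimal sketch of the quoted hyperparameters (SGD, momentum 0.9, weight decay 0.0005, a 200-epoch cosine schedule with 10 warm-up epochs, batch size 256); the base learning rate of 0.1 and the ResNet-18 backbone are assumptions, as the report does not quote them:

```python
# Sketch of the reported optimizer and schedule. Base LR (0.1) and the
# ResNet-18 backbone are assumptions; momentum, weight decay, epoch
# count, and warm-up length are quoted from the paper.
import torch
from torch import nn, optim
from torchvision import models

model = models.resnet18(num_classes=10)  # backbone choice is an assumption
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=0.0005)

# 10 linear warm-up epochs followed by cosine annealing over the remainder.
warmup = optim.lr_scheduler.LinearLR(optimizer, start_factor=0.01, total_iters=10)
cosine = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=190)  # 200 - 10 warm-up
scheduler = optim.lr_scheduler.SequentialLR(optimizer, schedulers=[warmup, cosine], milestones=[10])

for epoch in range(200):
    model.train()
    for images, labels in train_loader:  # train_loader from the sketch above
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # stepped once per epoch
```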
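Mean AUROC, the OOD-detection metric quoted in the Research Type row, can be reproduced with scikit-learn. A hedged sketch, assuming a negative max-softmax confidence as the OOD score (one common choice; the paper's exact scoring rule is not quoted in this report):

```python
# Hedged sketch: mean AUROC over several OOD test sets. The negative
# max-softmax score is an assumed stand-in for the paper's OOD score.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score

@torch.no_grad()
def ood_scores(model, loader):
    """Return per-sample OOD scores (higher = more likely OOD)."""
    model.eval()
    scores = []
    for images, _ in loader:
        probs = F.softmax(model(images), dim=1)
        scores.append(-probs.max(dim=1).values)  # low confidence -> high score
    return torch.cat(scores).numpy()

def mean_auroc(model, id_loader, ood_loaders):
    """AUROC of OOD (label 1) vs. in-distribution (label 0), averaged over OOD sets."""
    id_s = ood_scores(model, id_loader)
    aurocs = []
    for ood_loader in ood_loaders:
        ood_s = ood_scores(model, ood_loader)
        labels = np.concatenate([np.zeros_like(id_s), np.ones_like(ood_s)])
        scores = np.concatenate([id_s, ood_s])
        aurocs.append(roc_auc_score(labels, scores))
    return float(np.mean(aurocs))
```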