reproducibilityindex.ai

Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting

Authors: Anthony Chen, Huanrui Yang, Yulu Gan, Denis A Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical study shows Split-Ensemble, without additional computational cost, improves accuracy over a single model by 0.8%, 1.8%, and 25.5% on CIFAR-10, CIFAR-100, and Tiny-ImageNet, respectively. OOD detection for the same backbone and in-distribution datasets surpasses a single model baseline by 2.2%, 8.1%, and 29.6% in mean AUROC, respectively.
Researcher Affiliation	Collaboration	(1) School of Computer Science, Peking University; (2) University of California, Berkeley; (3) Panasonic Holdings Corporation; (4) Carnegie Mellon University.
Pseudocode	Yes	The detailed process of Split-Ensemble training is provided in the pseudo-code in Algorithm 1 of Appendix B.
Open Source Code	No	The paper does not provide a statement about releasing open-source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets	Yes	We perform classification tasks on four popular image classification benchmarks, including CIFAR-10, CIFAR-100 (Krizhevsky, 2009), Tiny ImageNet (Deng et al., 2009) and ImageNet (Krizhevsky et al., 2012) datasets.
Dataset Splits	No	The paper mentions 'test(val) sets' and discusses training and testing phases but does not explicitly provide specific training/validation/test dataset split percentages, absolute sample counts, or explicit references to predefined splits with citations that define these proportions.
Hardware Specification	Yes	Our Split-Ensemble model was trained over 200 epochs using a single NVIDIA A100 GPU with 80GB of memory, for experiments involving CIFAR-10, CIFAR-100, and Tiny ImageNet datasets. For the larger-scale ImageNet dataset, we employ 8 NVIDIA A100 GPUs, each with 80GB memory, to handle the increased computational demands.
Software Dependencies	No	The paper mentions using a library from (Kirchheim et al., 2022) for Gaussian and Uniform Noise generation, which is 'Pytorch-ood', but does not specify the version of PyTorch or any other core software dependencies with version numbers used for their own implementation.
Experiment Setup	Yes	We use an SGD optimizer with a momentum of 0.9 and weight decay of 0.0005. We also adopt a 200-epoch cosine learning rate schedule with 10 warm-up epochs and a batch size of 256.
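
The schedule described in the Experiment Setup row can be sketched in a few lines of plain Python. This is only an illustration of one common way to combine warm-up with cosine decay, not the authors' implementation: the peak learning rate (BASE_LR), the linear shape of the warm-up, and decay toward zero are all assumptions, since the row above states none of them. The function name lr_at_epoch is hypothetical.

```python
import math

# Values stated in the Experiment Setup row.
MOMENTUM = 0.9
WEIGHT_DECAY = 5e-4
TOTAL_EPOCHS = 200
WARMUP_EPOCHS = 10
BATCH_SIZE = 256

# Assumption: the peak learning rate is not given in the row above.
BASE_LR = 0.1


def lr_at_epoch(epoch, base_lr=BASE_LR,
                warmup_epochs=WARMUP_EPOCHS, total_epochs=TOTAL_EPOCHS):
    """Learning rate for a 0-indexed epoch: warm-up, then cosine decay."""
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch must be in [0, total_epochs)")
    if epoch < warmup_epochs:
        # Assumed linear ramp that reaches base_lr on the last warm-up epoch.
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine decay from base_lr toward 0 over the remaining epochs (assumed).
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# One learning rate per epoch for the full 200-epoch run.
schedule = [lr_at_epoch(e) for e in range(TOTAL_EPOCHS)]
```

In a training loop, the value for the current epoch would be written into the SGD optimizer's learning rate before each epoch, alongside the momentum and weight decay listed above.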