reproducibilityindex.ai

Towards Inference Efficient Deep Ensemble Learning

Authors: Ziyue Li, Kan Ren, Yifan Yang, Xinyang Jiang, Yuqing Yang, Dongsheng Li

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The experiments with different backbones on real-world datasets illustrate our method can bring up to 56% inference cost reduction while maintaining comparable performance to full ensemble, achieving significantly better ensemble utility than other baselines. Code and supplemental materials are available at https://seqml.github.io/irene. [Section "Experiment", subsection "Experimental Setup":] Here we present the details of experimental setup, including datasets, backbones used, and baselines for comparison.
Researcher Affiliation	Industry	Microsoft Research litzy0619owned@gmail.com, kan.ren@microsoft.com
Pseudocode	Yes	The overall training algorithm has been illustrated in Appendix B.
Open Source Code	Yes	Code and supplemental materials are available at https://seqml.github.io/irene.
Open Datasets	Yes	We conduct experiments on two image classification datasets, CIFAR-10 and CIFAR-100, the primary focus of neural ensemble methods (Zhang, Liu, and Yan 2020; Rame and Cord 2021). CIFAR (Krizhevsky, Hinton et al. 2009) contains 50,000 training samples and 10,000 test samples, which are labeled as 10 and 100 classes in CIFAR-10 and CIFAR-100, respectively.
Dataset Splits	No	The paper mentions 50,000 training samples and 10,000 test samples for CIFAR datasets, but does not explicitly provide details about a validation split or how it was derived.
Hardware Specification	No	No specific hardware details (like GPU/CPU models or cloud instances) are mentioned for running the experiments.
Software Dependencies	No	No specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow) are provided.
Experiment Setup	No	The paper mentions datasets, backbones (ResNet-32 and ResNet-18), and that ensemble methods use three base models (T=3), but it does not provide specific hyperparameters like learning rates, batch sizes, number of epochs, optimizer settings, or the values for the loss weights (ω1, ω2, ω3).
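
The Dataset Splits row notes that the paper gives no validation split for the 50,000 CIFAR training samples. Someone reproducing the work would have to choose one. The sketch below shows one common way to do that, with a seeded shuffle of the training indices. The helper name split_train_indices, the 45,000/5,000 split, and the seed are all assumptions made for illustration; none of them comes from the paper or its released code.

```python
import random


def split_train_indices(num_train=50_000, val_size=5_000, seed=0):
    """Hypothetical helper: hold out `val_size` training indices for validation.

    The defaults (5,000 held-out samples, seed 0) are illustrative assumptions,
    not values reported in the paper.
    """
    if not 0 < val_size < num_train:
        raise ValueError("val_size must be between 1 and num_train - 1")
    indices = list(range(num_train))
    # A dedicated, seeded generator keeps the split repeatable across runs.
    random.Random(seed).shuffle(indices)
    val_idx = sorted(indices[:val_size])
    train_idx = sorted(indices[val_size:])
    return train_idx, val_idx


train_idx, val_idx = split_train_indices()
print(len(train_idx), len(val_idx))  # 45000 5000
```

The two index lists are disjoint and together cover every training sample, so they can be passed to whatever subset mechanism the chosen data loader provides. Recording the seed alongside the results would address the gap flagged in the Dataset Splits row.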