Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Differentiable Model Selection for Ensemble Learning
Authors: James Kotary, Vincenzo Di Vito, Ferdinando Fioretto
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The e2e-CEL training is evaluated on several vision classification tasks: digit classification on MNIST dataset [Deng, 2012], age-range estimation on UTKFace dataset [Zhifei Zhang, 2017], image classification on CIFAR10 dataset [Krizhevsky et al., 2009], and emotion detection on FER2013 dataset [Liu et al., 2016]. Table 2 reports the best accuracy over all the ensemble sizes k of ensembles trained by e2e-CEL along with that of each baseline ensemble model, where each are formed using the same pre-trained base learners. |
| Researcher Affiliation | Academia | James Kotary¹, Vincenzo Di Vito¹ and Ferdinando Fioretto¹; ¹University of Virginia; EMAIL, fioretto@virginia.edu |
| Pseudocode | Yes | Algorithm 1 summarizes the e2e-CEL procedure for training a selection net. Algorithm 1: Training the Selection Net |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | digit classification on MNIST dataset [Deng, 2012], age-range estimation on UTKFace dataset [Zhifei Zhang, 2017], image classification on CIFAR10 dataset [Krizhevsky et al., 2009], and emotion detection on FER2013 dataset [Liu et al., 2016]. |
| Dataset Splits | Yes | In each dataset there is an implied train/test/validation split, so that evaluation of a trained model is always performed on its test portion. Where this distinction is needed, the symbols X_train, X_valid, X_test are used. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments, only stating that the selection net uses the 'same CNN architecture as that of the base learner models'. |
| Software Dependencies | No | The paper refers to 'standard automatic differentiation employed in machine learning libraries [Paszke et al., 2019]' (which cites PyTorch), but no specific version numbers for any software dependencies or libraries are provided. |
| Experiment Setup | No | The paper describes the general approach to training base learners (e.g., specializing on classes) and the selection net's architecture, and mentions `alpha` in Algorithm 1, but it does not provide specific numerical hyperparameter values such as learning rate, batch size, or number of epochs for the main experimental setup. |