Diverse Ensemble Evolution: Curriculum Data-Model Marriage
Authors: Tianyi Zhou, Shengjie Wang, Jeff A. Bilmes
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, DivE² outperforms other ensemble training methods under a variety of model aggregation techniques, while also maintaining competitive efficiency. We apply DivE² to four benchmark datasets, and show that it improves over randomization-based ensemble training methods on a variety of approaches to aggregate ensemble models into a single prediction. |
| Researcher Affiliation | Academia | University of Washington, Seattle {tianyizh, wangsj, bilmes}@uw.edu |
| Pseudocode | Yes | Algorithm 1 SELECTLEARN(k, p, λ, γ, {wᵢ⁰}) |
| Open Source Code | No | The paper does not contain an unambiguous statement that the authors are releasing the code for the described methodology, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | (1) MobileNetV2 [56] on CIFAR10 [38]; (2) ResNet18 [29] on CIFAR100 [38]; (3) CNNs with two convolutional layers on Fashion-MNIST (Fashion in all tables) [69]; (4) and lastly CNNs with six convolutional layers on STL10 [12]. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly provide specific details about training/validation/test dataset splits (e.g., percentages, sample counts, or explicit cross-validation setup) needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We everywhere fix the number of models at m = 10, and use ℓ₂ parameter regularization on w with weight 1×10⁻⁴. In DivE²'s training phase, we start from k = 6, p = n/2m and linearly change to k = 1, p = 3n/m in T = 200 episodes. |
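
The Experiment Setup row quotes a curriculum in which k (models selected per sample) and p (samples selected per model) are changed linearly over T = 200 episodes. A minimal sketch of that schedule follows, assuming plain per-episode linear interpolation; the function name `curriculum_schedule`, the rounding choices, and the n = 50000 example are illustrative assumptions, not taken from the paper.

```python
def curriculum_schedule(t, n, m=10, T=200):
    """Hypothetical (k, p) at episode t, per the quoted linear schedule."""
    frac = min(t / T, 1.0)
    # k: models per sample, 6 -> 1; p: samples per model, n/2m -> 3n/m.
    k = round(6 + frac * (1 - 6))
    p = int(n / (2 * m) + frac * (3 * n / m - n / (2 * m)))
    return k, p

# Example with n = 50000 (a CIFAR10-sized training set) and m = 10 models:
# the schedule moves from (k=6, p=2500) at t=0 to (k=1, p=15000) at t=200.
for t in (0, 100, 200):
    print(t, curriculum_schedule(t, n=50000))
```

The quoted ℓ₂ regularization on w (weight 1×10⁻⁴) would sit in the optimizer configuration rather than in this schedule, so it is omitted from the sketch.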