Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Wisdom of the Ensemble: Improving Consistency of Deep Learning Models
Authors: Lijing Wang, Dipanjan Ghosh, Maria Gonzalez Diaz, Ahmed Farahat, Mahbubul Alam, Chetan Gupta, Jiangzhuo Chen, Madhav Marathe
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the theory using three datasets and two state-of-the-art deep learning classifiers we also propose an efficient dynamic snapshot ensemble method and demonstrate its value. Code for our algorithm is available at https://github.com/christa60/dynens. |
| Researcher Affiliation | Collaboration | Lijing Wang University of Virginia EMAIL Dipanjan Ghosh Hitachi America Ltd. EMAIL Maria Teresa Gonzalez Diaz Hitachi America Ltd. EMAIL Ahmed Farahat Hitachi America Ltd. EMAIL Mahbubul Alam Hitachi America Ltd. EMAIL Chetan Gupta Hitachi America Ltd. EMAIL Jiangzhuo Chen University of Virginia EMAIL Madhav Marathe University of Virginia EMAIL |
| Pseudocode | Yes | Algorithm 1: Pseudocode of the dynamic snapshot ensemble (Dyn Snap) |
| Open Source Code | Yes | Code for our algorithm is available at https://github.com/christa60/dynens. |
| Open Datasets | Yes | We conduct experiments using three datasets and two state-of-the-art models. YAHOO!Answers [36] is a topic classification dataset with 10 output categories, 140K and 6K training and testing samples. CIFAR10 and CIFAR100 [23] are datasets with 10 and 100 output categories respectively, 50K and 10K color images as training and testing samples. |
| Dataset Splits | Yes | The dataset, models and hyper-parameters are shown in Table 1. Table 1: Data and Models ... Training ... Validation ... Testing |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not list specific software components with their version numbers required for reproducibility. |
| Experiment Setup | Yes | The experiment settings for Single Base models are shown in Table 1. We set m = 20 for ensemble methods, and N = 10, β = β for Dyn Snap-cyc and Dyn Snap-step, Fd(t) in Dyn Snap-step is 1e-1, 1e-2, 1e-3 at 80, 120, 160 epochs, dropout with 0.1 drop probability. |
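The Dyn Snap pseudocode itself is not reproduced in this report, but the snapshot-ensemble idea it builds on can be illustrated with a minimal sketch. This is not the authors' algorithm from the linked repository; `cyclic_lr`, `ensemble_predict`, and all parameter names are illustrative assumptions based on standard snapshot ensembling (cosine-annealed learning-rate restarts, with the m snapshots' class probabilities averaged at test time).

```python
import math

def cyclic_lr(epoch, total_epochs, num_cycles, lr_max):
    """Cosine-annealed cyclic learning rate with warm restarts.

    The rate restarts at lr_max every total_epochs / num_cycles epochs;
    a model snapshot is typically saved at the end of each cycle, when
    the rate is near its minimum and the model sits in a local optimum.
    """
    epochs_per_cycle = total_epochs / num_cycles
    t = (epoch % epochs_per_cycle) / epochs_per_cycle  # position in cycle, [0, 1)
    return lr_max / 2 * (math.cos(math.pi * t) + 1)

def ensemble_predict(snapshot_probs):
    """Average per-class probability vectors from the saved snapshots."""
    m = len(snapshot_probs)
    n_classes = len(snapshot_probs[0])
    return [sum(p[c] for p in snapshot_probs) / m for c in range(n_classes)]
```

For example, with 200 training epochs and 10 cycles, the rate equals `lr_max` at each cycle start and decays toward zero by the cycle's end; two snapshots predicting `[0.8, 0.2]` and `[0.6, 0.4]` combine to roughly `[0.7, 0.3]`.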