Rethinking Fano’s Inequality in Ensemble Learning
Authors: Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, Nobuo Nukaga
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Further, we empirically validate and demonstrate the proposed theory through extensive experiments on actual systems. ... We validate the framework through extensive experiments on DNN ensemble systems. |
| Researcher Affiliation | Industry | Terufumi Morishita¹, Gaku Morio*¹, Shota Horiguchi*¹, Hiroaki Ozaki¹, Nobuo Nukaga¹ ... ¹Hitachi, Ltd. Research and Development Group, Kokubunji, Tokyo, Japan. |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | We release our code as open source. Available at: https://github.com/hitachi-nlp/ensemble-metrics |
| Open Datasets | Yes | We used eight classification tasks with moderately-sized datasets for computational reasons: BoolQ (Clark et al., 2019), CoLA (Dolan & Brockett, 2005), CosmosQA (Khot et al., 2018), MNLI (Williams et al., 2018), MRPC (Dolan & Brockett, 2005), SciTail (Khot et al., 2018), SST (Socher et al., 2013), and QQP. ... Table E.10: Tasks used in this study. The majority of tasks are from the GLUE benchmark (Wang et al., 2018) and the SuperGLUE benchmark (Wang et al., 2019). All datasets are publicly available. (A data-loading sketch follows the table.) |
| Dataset Splits | Yes | In order to train the meta-estimator of Stacking, we must take a cross-validation-based dataset splitting strategy (Wolpert, 1992)... In this study, we used n = 5... In this study, we used l = 4. ... Validation sets were used only during the preliminary experiments to adjust some hyperparameters (shown below). (A splitting sketch follows the table.) |
| Hardware Specification | Yes | A single run of experiments required about 200 GPUs (V100) × 1 day. ... Computational resources of the AI Bridging Cloud Infrastructure (ABCI) provided by the National Institute of Advanced Industrial Science and Technology (AIST) were used. |
| Software Dependencies | Yes | We implemented the fine-tuning of DNNs described here using the jiant library (Phang et al., 2020) (v2.2.0), which in turn utilizes Hugging Face's Transformers library (Wolf et al., 2020). ... We implemented the model combination methods in Table 2 using scikit-learn. For the training of Stacking meta-estimators, we used the hyperparameters shown in Table E.9. ... Most of the hyperparameters are set as default values of scikit-learn (version 0.22.2). (A model-combination sketch follows the table.) |
| Experiment Setup | Yes | We used the hyperparameters shown in Table E.8 to fine-tune all of the DNN types. ... Table E.8: Hyperparameters used for fine-tuning of DNNs: learning rate 3e-5 ([1e-5, 1e-4] for the random sampling of Random-HyP); optimizer Adam (Kingma & Ba, 2015) (ϵ = 1e-8) with linear warmup (data-size proportion = 0.1), as described in (Devlin et al., 2019); gradient clipping 1.0; gradient accumulation steps 1; epochs 5; dropout: DNN-specific values (following jiant (Phang et al., 2020)); training batch size 16; inference batch size 32; number of softmax layers 1. ... Table E.9: Meta-estimator hyperparameters. (An optimizer-setup sketch follows the table.) |
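The datasets row above notes that all eight tasks are publicly available. As a hedged illustration only (the paper loads tasks through jiant, not through the Hugging Face `datasets` library, so this loader choice is our assumption), one of the listed GLUE tasks can be fetched in a few lines:

```python
# Illustrative only: fetch one of the listed GLUE tasks (MRPC) with the Hugging Face
# `datasets` library. This is an assumption for demonstration purposes; the authors'
# pipeline goes through jiant.
from datasets import load_dataset

mrpc = load_dataset("glue", "mrpc")              # DatasetDict with train/validation/test splits
print(mrpc["train"].num_rows, mrpc["validation"].num_rows)
print(mrpc["train"][0])                          # sentence pair plus paraphrase label
```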
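The dataset-splits row quotes the cross-validation-based splitting strategy (Wolpert, 1992) with n = 5 folds used to train the Stacking meta-estimator. Below is a minimal sketch of that idea, with simple scikit-learn classifiers and synthetic data standing in for the paper's fine-tuned DNNs and tasks:

```python
# Minimal sketch of cross-validation-based splitting for a Stacking meta-estimator
# (Wolpert, 1992) with n = 5 folds. The base learners and data here are stand-ins,
# not the paper's fine-tuned DNNs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

base_models = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    LogisticRegression(max_iter=1000),
]

# Out-of-fold class probabilities: every training example is predicted by a base
# model that never saw it, which is exactly what the n-fold splitting guarantees.
meta_features = np.hstack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba") for m in base_models
])

# The meta-estimator is then fit on these out-of-fold predictions.
meta_estimator = LogisticRegression(max_iter=1000).fit(meta_features, y)
```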
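The software-dependencies row states that the model-combination methods of the paper's Table 2 were implemented with scikit-learn. As a hedged stand-in (the specific estimators and data below are ours, not the paper's methods), this sketch combines three toy classifiers by soft voting, i.e., averaging their predicted class probabilities:

```python
# Hedged illustration of scikit-learn-based model combination: soft voting over three
# stand-in classifiers on synthetic data. Not the authors' Table 2 methods.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",   # average predicted class probabilities across members
)
ensemble.fit(X_tr, y_tr)
print("ensemble accuracy:", ensemble.score(X_te, y_te))
```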
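Finally, the experiment-setup row lists the fine-tuning hyperparameters of Table E.8. The sketch below wires those values (learning rate 3e-5, Adam with ϵ = 1e-8, linear warmup over 10% of steps, gradient clipping 1.0, 5 epochs) into a generic PyTorch/Transformers optimizer setup; `model` and `train_loader` are hypothetical placeholders, and the surrounding loop is ours rather than the authors' jiant configuration:

```python
# Hedged sketch: Table E.8 values plugged into a generic PyTorch/Transformers setup.
# `model` and `train_loader` are placeholders; the paper's runs are configured via jiant.
import torch
from transformers import get_linear_schedule_with_warmup

def build_optimizer_and_scheduler(model, train_loader, epochs=5, lr=3e-5,
                                  warmup_proportion=0.1):
    total_steps = len(train_loader) * epochs
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, eps=1e-8)
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=int(warmup_proportion * total_steps),
        num_training_steps=total_steps,
    )
    return optimizer, scheduler

# Inside the training loop, gradients are clipped to 1.0 before each optimizer step:
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```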