BERT Loses Patience: Fast and Robust Inference with Early Exit

Authors: Wangchunshu Zhou, Canwen Xu, Tao Ge, Julian McAuley, Ke Xu, Furu Wei

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conduct extensive experiments on the GLUE benchmark and show that PABEE outperforms existing prediction probability distribution-based exit criteria by a large margin." (see the exit-criterion sketch below the table) |
| Researcher Affiliation | Collaboration | Wangchunshu Zhou¹, Canwen Xu², Tao Ge³, Julian McAuley², Ke Xu¹, Furu Wei³ (¹Beihang University; ²University of California, San Diego; ³Microsoft Research Asia) |
| Pseudocode | No | The paper describes the inference and training processes using mathematical equations but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | "Code available at https://github.com/JetRunner/PABEE." |
| Open Datasets | Yes | "We evaluate our proposed approach on the GLUE benchmark [35]." |
| Dataset Splits | Yes | "We apply an early stopping mechanism and select the model with the best performance on the development set." |
| Hardware Specification | Yes | "We conduct our experiments on a single Nvidia V100 16GB GPU." |
| Software Dependencies | No | The paper mentions implementing PABEE on 'Hugging Face's Transformers [43]' but does not provide a specific version number for this or any other software dependency. |
| Experiment Setup | Yes | "We perform grid search over batch sizes of {16, 32, 128}, and learning rates of {1e-5, 2e-5, 3e-5, 5e-5} with an Adam optimizer. We apply an early stopping mechanism and select the model with the best performance on the development set." (see the grid-search sketch below) |
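The Research Type row quotes PABEE's headline claim about its exit criterion. For context, the paper's patience-based exit attaches a classification head to every encoder layer and stops inference once the prediction has stayed the same for a fixed number of consecutive layers (the "patience"). The sketch below illustrates that criterion only; `layers`, `classifiers`, and `pabee_inference` are illustrative names, not the released repo's API.

```python
import torch

def pabee_inference(layers, classifiers, hidden_states, patience=6):
    """Minimal sketch of PABEE's patience-based early exit.

    `layers` is a list of transformer encoder layers and `classifiers`
    a matching list of per-layer classification heads; both are
    illustrative stand-ins, and for clarity this assumes batch size 1.
    """
    prev_pred = None
    streak = 0  # consecutive layers whose predictions agreed
    for depth, (layer, head) in enumerate(zip(layers, classifiers), start=1):
        hidden_states = layer(hidden_states)
        logits = head(hidden_states[:, 0])  # classify from the [CLS] position
        pred = logits.argmax(dim=-1)
        if prev_pred is not None and torch.equal(pred, prev_pred):
            streak += 1
        else:
            streak = 0
        prev_pred = pred
        # Exit as soon as the prediction is stable for `patience` layers.
        if streak >= patience:
            return logits, depth
    return logits, depth  # fell through: all layers were used
```

Unlike the prediction-probability baselines the quote compares against, this stopping rule never inspects confidence scores, only cross-layer agreement, which is the property the paper credits for its robustness gains.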
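The Experiment Setup row fully specifies the search space, so the quoted procedure can be written down directly. The loop below is a minimal sketch; `train_and_eval` is a hypothetical helper assumed to fine-tune the model with Adam at the given setting, apply the quoted early stopping, and return the development-set score.

```python
from itertools import product

# Hyperparameter grid quoted in the Experiment Setup row above.
batch_sizes = [16, 32, 128]
learning_rates = [1e-5, 2e-5, 3e-5, 5e-5]

best_score, best_config = float("-inf"), None
for bs, lr in product(batch_sizes, learning_rates):
    # `train_and_eval` is a hypothetical helper: fine-tune with Adam at
    # this (batch size, learning rate) setting, early-stop on the dev
    # set, and return the best dev score for this configuration.
    score = train_and_eval(batch_size=bs, learning_rate=lr, optimizer="adam")
    if score > best_score:
        best_score, best_config = score, (bs, lr)

print(f"best dev score {best_score:.4f} at batch size / lr {best_config}")
```

The selected configuration is the one whose model performed best on the development set, matching the model-selection rule quoted in the Dataset Splits row.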