Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems

Authors: Giung Nam, Juho Lee

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our empirical analysis demonstrates the effectiveness of our proposed low precision ensembling method compared to existing ensemble approaches." (see the rounding sketch below)
Researcher Affiliation | Academia | "Giung Nam, Kim Jaechul Graduate School of AI, KAIST, Daejeon, South Korea, giung@kaist.ac.kr; Juho Lee, Kim Jaechul Graduate School of AI, KAIST, Daejeon, South Korea, juholee@kaist.ac.kr"
Pseudocode | No | "The paper does not contain explicitly labeled pseudocode or algorithm blocks."
Open Source Code | Yes | "The code is available at https://github.com/cs-giung/lpe-bsr."
Open Datasets | Yes | "We employed two datasets for our experiments: ImageNet (Russakovsky et al., 2015) for ViT and CLIP-ViT models, and MMLU (Hendrycks et al., 2021) for LLaMA."
Dataset Splits | Yes | "Table 1 summarizes the evaluation results on a subset of the ImageNet validation split, along with the parameter count for each model."
Hardware Specification | Yes | "We conducted experiments using TPUv2/v3/v4 cores, with flexibility in selecting the cores based on the memory requirements of each experiment."
Software Dependencies | No | "We built our experimental code using JAX (Bradbury et al., 2018) and Transformers (Wolf et al., 2020), both licensed under Apache-2.0." (see the stack sketch below)
Experiment Setup | Yes | "The optimization process for CLIP-ViT-L/14 models concludes after 100,000 iterations with a minibatch size of 64, employing a cosine decaying learning rate schedule." (see the schedule sketch below)
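
The repository name (lpe-bsr) suggests that ensemble members are produced from a single pretrained checkpoint by stochastically rounding its weights into a low precision number system. Below is a minimal JAX sketch of that idea, not the authors' implementation: the uniform fixed-point grid, the spacing `step`, the number of members, and the per-leaf key splitting are all assumptions.

```python
import jax
import jax.numpy as jnp

def stochastic_round(key, w, step):
    """Round each weight to the grid {k * step} stochastically.

    A weight between two grid points rounds up with probability equal to
    its fractional position, so the rounding is unbiased: E[round(w)] = w.
    """
    lower = jnp.floor(w / step) * step
    p_up = (w - lower) / step
    go_up = jax.random.bernoulli(key, p_up)
    return lower + go_up * step

def low_precision_member(key, params, step):
    """Draw one low-precision ensemble member from a full-precision pytree."""
    leaves, treedef = jax.tree_util.tree_flatten(params)
    keys = jax.random.split(key, len(leaves))
    rounded = [stochastic_round(k, w, step) for k, w in zip(keys, leaves)]
    return jax.tree_util.tree_unflatten(treedef, rounded)

# Ex uno pluria: several members from one checkpoint.
# The toy parameter shapes, grid spacing, and member count are assumptions.
params = {"dense": {"kernel": jnp.ones((4, 4)), "bias": jnp.zeros((4,))}}
members = [low_precision_member(k, params, step=2.0**-8)
           for k in jax.random.split(jax.random.PRNGKey(0), 4)]
```

Because the rounding is unbiased, each member is an independent low precision perturbation of the same checkpoint, and their predictions can be averaged like any other ensemble.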
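The Software Dependencies row names JAX and Hugging Face Transformers but no versions. As a hedged illustration of that stack, a Flax vision model can be loaded as follows; the checkpoint name is illustrative, not necessarily the one used in the paper.

```python
from transformers import FlaxViTForImageClassification

# Illustrative checkpoint; the paper's exact ViT/CLIP-ViT weights may differ.
model = FlaxViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
params = model.params  # a JAX pytree, usable with the rounding sketch above
```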
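The Experiment Setup row pins down the schedule shape (cosine decay), the iteration budget (100,000), and the minibatch size (64), but not the optimizer or the peak learning rate. A minimal Optax sketch under those assumptions:

```python
import optax

TOTAL_STEPS = 100_000   # from the Experiment Setup row
BATCH_SIZE = 64         # from the Experiment Setup row

# The peak learning rate and the choice of AdamW are assumptions,
# not stated in the report.
schedule = optax.cosine_decay_schedule(init_value=1e-5, decay_steps=TOTAL_STEPS)
optimizer = optax.adamw(learning_rate=schedule)
```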