reproducibilityindex.ai

EMR-Merging: Tuning-Free High-Performance Model Merging

Authors: Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We find that EMR-MERGING shows outstanding performance compared to existing merging methods under different classical and newly-established settings, including merging different numbers of vision models (up to 30), NLP models, PEFT models, and multi-modal models.
Researcher Affiliation	Collaboration	Chenyu Huang (1), Peng Ye (1,3), Tao Chen (1), Tong He (2), Xiangyu Yue (3), Wanli Ouyang (3). Affiliations: (1) Fudan University; (2) Shanghai AI Laboratory; (3) The Chinese University of Hong Kong
Pseudocode	Yes	We summarize the procedure of EMR-MERGING in Algorithm 1. (Algorithm 1: EMR-MERGING Procedure)
Open Source Code	Yes	Our code is available at https://github.com/harveyhuang18/EMR_Merging.
Open Datasets	Yes	We employ ViT-B/32 and ViT-L/14, two variants of CLIP [54] models' visual encoders, as the pre-trained models. The performance of each method is evaluated by eight image classification tasks, including SUN397 [83], Cars [35], RESISC45 [10], EuroSAT [27], SVHN [91], GTSRB [65], MNIST [38], and DTD [11].
Dataset Splits	No	While the paper refers to 'validation data' (e.g., in Table 1 and Table 7), it does not explicitly provide the specific training/validation/test split percentages or sample counts for the datasets used in its experiments. It mentions following settings from other papers but does not detail the splits here.
Hardware Specification	No	The paper mentions that 'The computations in this research were performed using the CFFF platform of Fudan University' in the Acknowledgement section. However, this is a general platform reference and does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies	No	The paper refers to software libraries like Huggingface [79], timm [77], and torchvision [44] through citations. However, it does not provide specific version numbers for these or any other ancillary software components needed to replicate the experiments.
Experiment Setup	No	The paper states 'We follow the setting from Task Arithmetic [30], Ties-Merging [84], and AdaMerging [85]' (Section 4.1.1) and provides model details (e.g., ViT-B/32, RoBERTa-base). However, it does not explicitly list specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations for its experiments.