Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
EMR-Merging: Tuning-Free High-Performance Model Merging
Authors: Chenyu Huang, Peng Ye, Tao Chen, Tong He, Xiangyu Yue, Wanli Ouyang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that EMR-MERGING shows outstanding performance compared to existing merging methods under different classical and newly-established settings, including merging different numbers of vision models (up to 30), NLP models, PEFT models, and multi-modal models. |
| Researcher Affiliation | Collaboration | Chenyu Huang1, Peng Ye1,3, Tao Chen1, Tong He2, Xiangyu Yue3, Wanli Ouyang3; 1 Fudan University, 2 Shanghai AI Laboratory, 3 The Chinese University of Hong Kong |
| Pseudocode | Yes | We summarize the procedure of EMR-MERGING in Algorithm 1. Algorithm 1 EMR-MERGING Procedure |
| Open Source Code | Yes | Our code is available at https://github.com/harveyhuang18/EMR_Merging. |
| Open Datasets | Yes | We employ ViT-B/32 and ViT-L/14, two variants of CLIP [54] models' visual encoders, as the pre-trained models. The performance of each method is evaluated by eight image classification tasks, including SUN397 [83], Cars [35], RESISC45 [10], EuroSAT [27], SVHN [91], GTSRB [65], MNIST [38], and DTD [11]. |
| Dataset Splits | No | While the paper refers to 'validation data' (e.g., in Table 1 and Table 7), it does not explicitly provide the specific training/validation/test split percentages or sample counts for the datasets used in its experiments. It mentions following settings from other papers but does not detail the splits here. |
| Hardware Specification | No | The paper mentions that 'The computations in this research were performed using the CFFF platform of Fudan University' in the Acknowledgement section. However, this is a general platform reference and does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper refers to software libraries like Huggingface [79], timm [77], and torchvision [44] through citations. However, it does not provide specific version numbers for these or any other ancillary software components needed to replicate the experiments. |
| Experiment Setup | No | The paper states 'We follow the setting from Task Arithmetic [30], Ties-Merging [84], and AdaMerging [85]' (Section 4.1.1) and provides model details (e.g., ViT-B/32, RoBERTa-base). However, it does not explicitly list specific hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations for its experiments. |