Towards Modular LLMs by Building and Reusing a Library of LoRAs
Authors: Oleksiy Ostapenko, Zhan Su, Edoardo Ponti, Laurent Charlin, Nicolas Le Roux, Lucas Caccia, Alessandro Sordoni
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment with several LLMs, such as Phi-2 and Mistral, on a wide array of held-out tasks, verifying that MBC-based adapters and Arrow routing lead to superior generalization to new tasks. Thus, we make steps towards creating modular, adaptable LLMs that can match or outperform traditional joint training. |
| Researcher Affiliation | Collaboration | ¹Microsoft Research, ²Mila Quebec AI Institute, ³Université de Montréal, ⁴University of Copenhagen, ⁵University of Edinburgh, ⁶HEC Montréal, ⁷Canada CIFAR AI Chair. |
| Pseudocode | Yes | Algorithm 1 Model-Based Clustering (MBC)... Algorithm 2 Arrow Routing (hedged sketches of both algorithms appear after the table) |
| Open Source Code | No | The paper states: "We acknowledge the support of Matheus Pereira for maintaining and optimizing the code, as well as for preparing the code release." However, it does not provide concrete access, such as a specific repository link, nor does it explicitly state that the code is publicly available at the time of publication. |
| Open Datasets | Yes | We train expert modules on 256 tasks from the original Flan v2 dataset (Longpre et al., 2023). |
| Dataset Splits | Yes | We threshold the number of training examples to 10,000 examples per task and reserve 1,000 for validation. |
| Hardware Specification | No | The paper mentions the LLMs used (Phi-2 and Mistral 7B) but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for training or inference. |
| Software Dependencies | No | The paper mentions the use of specific LLM models (Phi-2, Mistral) and parameter-efficient fine-tuning methods like LoRA. While it references the PEFT library in its bibliography, it does not provide specific version numbers for Python, PyTorch, CUDA, or other key software components used in the experiments. |
| Experiment Setup | Yes | Unless stated otherwise, for all our multi-task training and single-task adaptation scenarios, we use LoRA rank of 4, dropout of 0.05 and learning rate of 1e-4. Unless specified, we set the number of clusters for MBC to 10, resulting in the best upstream validation loss and downstream performance for Phi-2, as demonstrated in Fig. 4. (A hedged configuration sketch based on these values appears after the table.) |
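
The Pseudocode row above refers to Algorithm 1 (Model-Based Clustering) and Algorithm 2 (Arrow Routing). The snippet below is a minimal sketch of both ideas, assuming each task is represented by its flattened per-task LoRA weights and that each expert's routing direction is the top right singular vector of its low-rank update ΔW = B·A; all function and variable names are illustrative and are not taken from the authors' code.

```python
import numpy as np
import torch
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize


def mbc_cluster_tasks(task_lora_weights, n_clusters=10):
    """Model-Based Clustering (sketch): group tasks whose privately trained
    LoRA parameter vectors point in similar directions; one adapter per
    cluster is then trained on the union of its tasks' data (not shown)."""
    names = list(task_lora_weights)
    X = np.stack([task_lora_weights[n] for n in names])
    X = normalize(X)  # L2-normalise so k-means approximates cosine-similarity clustering
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    return {c: [n for n, l in zip(names, labels) if l == c] for c in range(n_clusters)}


def arrow_prototypes(lora_As, lora_Bs):
    """One unit-norm 'arrow' per expert: the top right singular vector of
    Delta_W = B @ A (A: r x d_in, B: d_out x r), i.e. the input direction
    the adapter amplifies most."""
    protos = []
    for A, B in zip(lora_As, lora_Bs):
        delta_w = B @ A                                    # (d_out, d_in)
        _, _, Vh = torch.linalg.svd(delta_w, full_matrices=False)
        protos.append(Vh[0])                               # (d_in,)
    return torch.stack(protos)                             # (n_experts, d_in)


def arrow_route(h, protos, top_k=4):
    """Score hidden states h (batch, d_in) against every expert's arrow and
    keep the top-k; the absolute value is used because the sign of a singular
    vector is arbitrary."""
    scores = (h @ protos.T).abs()                          # (batch, n_experts)
    top_vals, top_idx = scores.topk(top_k, dim=-1)
    weights = torch.softmax(top_vals, dim=-1)              # mixing weights over selected experts
    return top_idx, weights
```

In the paper, routing is applied per layer and per token; the sketch above keeps only the clustering and scoring logic.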
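
The Experiment Setup row reports a LoRA rank of 4, dropout of 0.05, and a learning rate of 1e-4. Below is a minimal sketch of how such a configuration could be expressed with the Hugging Face PEFT library; `lora_alpha` and `target_modules` are assumptions not stated in this section, and the snippet is not the authors' training code.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

lora_cfg = LoraConfig(
    r=4,                                            # rank reported in the paper
    lora_dropout=0.05,                              # dropout reported in the paper
    lora_alpha=16,                                  # assumption: not given in this section
    target_modules=["q_proj", "k_proj", "v_proj"],  # assumption: projection names vary by model
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
# Training would then use a learning rate of 1e-4, with up to 10,000 examples
# per task and 1,000 reserved for validation, as reported above.
```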