Mixture of LoRA Experts

Authors: Xun Wu, Shaohan Huang, Furu Wei

ICLR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted in both Natural Language Processing (NLP) and Vision & Language (V&L) domains validate the effects of MOLE.
Researcher Affiliation | Collaboration | 1 Microsoft Research Asia, 2 Tsinghua University
Pseudocode | No | The paper does not contain any sections explicitly labeled "Pseudocode" or "Algorithm", nor any structured code-like blocks describing a procedure.
Open Source Code | Yes | Our code is available at https://github.com/yushuiwx/MoLE.git.
Open Datasets | Yes | We conducted extensive experiments across various tasks, including Translation, Natural Language Inference (NLI), Struct-to-Text, Closed-Book QA, and multiple subtasks within the Big-Bench Hard (BBH) (Ghazal et al., 2013) dataset. We trained a single LoRA on a combined dataset comprising the ANLI-R1 (Nie et al., 2019), ANLI-R2 (Nie et al., 2019), and QNLI (Rajpurkar et al., 2018) datasets, as depicted in Table 5.
Dataset Splits | No | The paper describes training parameters such as learning rate, batch size, and iterations, and mentions test sets for evaluation, but it does not explicitly specify a validation set or a dedicated validation split with percentages or counts.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions models and frameworks such as Stable Diffusion V2.1 and Flan-T5, but it does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | During MOLE training, we process images at 512 × 512 resolution and set the learning rate to 1e-5. We use the DDPM sampler (Ho et al., 2020) with 50 steps in each case and train 400 iterations for each required composition, with batch size 2 and α set to 0.5.
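Since the paper provides no pseudocode, the following is a rough illustrative sketch of the general mixture-of-LoRA-experts idea in numpy. Everything here is an assumption for illustration, not the authors' implementation: the variable names, the tiny dimensions, and the softmax gating over per-expert LoRA deltas are all hypothetical; only the scaling value α = 0.5 mirrors the number reported in the experiment setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions for illustration only.
d_in, d_out, r, n_experts = 8, 8, 2, 3
alpha = 0.5  # LoRA scaling; matches the alpha = 0.5 reported in the setup

W = rng.normal(size=(d_in, d_out))         # frozen base weight
A = rng.normal(size=(n_experts, d_in, r))  # per-expert LoRA down-projections
B = np.zeros((n_experts, r, d_out))        # per-expert LoRA up-projections (zero-init)
Wg = rng.normal(size=(d_in, n_experts))    # learnable gating weight (assumed form)

def mole_forward(x):
    """Frozen base output plus a gated sum of LoRA expert deltas."""
    gate_logits = x @ Wg
    gates = np.exp(gate_logits - gate_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)          # softmax over experts
    expert_out = np.einsum('bi,kir,kro->bko', x, A, B)  # each expert's LoRA delta
    delta = np.einsum('bk,bko->bo', gates, expert_out)  # gate-weighted mixture
    return x @ W + alpha * delta

x = rng.normal(size=(4, d_in))
y = mole_forward(x)
print(y.shape)  # (4, 8)
```

With the up-projections `B` initialized to zero, the mixture contributes nothing at first, so the layer initially reproduces the frozen base output, a common LoRA initialization choice.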