Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters
Authors: Yuhang Zhou, Zihua Zhao, Siyuan Du, Haolin Li, Jiangchao Yao, Ya Zhang, Yanfeng Wang
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments to verify the superiority of MoLA over previous state-of-the-art methods and present in-depth analysis on its working mechanism. |
| Researcher Affiliation | Academia | ¹Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; ²Shanghai Artificial Intelligence Laboratory; ³Fudan University. |
| Pseudocode | No | The paper describes methods and processes in text and equations but does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code is available at: https://github.com/MediaBrain-SJTU/MoLA |
| Open Datasets | Yes | For domain heterogeneity, we use VLCS (Torralba & Efros, 2011) and OfficeHome (Venkateswara et al., 2017) datasets; for multi-input task heterogeneity, we use RadImageNet (Mei et al., 2022) and MedMNIST v2 (Yang et al., 2021) datasets; for single-input task heterogeneity, we use NYUv2 (Silberman et al., 2012). |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly specify validation splits or their proportions for all experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions using AdamW, SGD, and Adam optimizers with citations but does not specify software dependencies with version numbers, such as the PyTorch or Python versions used. |
| Experiment Setup | Yes | We apply the AdamW optimizer (...) with learning rate of 0.0001 for experiments on VLCS and OfficeHome datasets, SGD optimizer (...) with learning rate of 0.05 on RadImageNet and MedMNIST datasets and Adam optimizer (...) with learning rate of 0.0001 on NYUv2 dataset. For all of the experiments, the training batch-size is set to 128. |
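The experiment-setup row above can be summarized as a minimal PyTorch sketch of the per-dataset optimizer settings. This is only an illustration of the quoted hyperparameters, not the authors' released code; the `build_optimizer` helper, `model`, and dataset-name strings are hypothetical, and details the paper does not report here (weight decay, momentum, schedulers) are omitted.

```python
# Minimal sketch of the reported optimizer settings, assuming PyTorch.
# `model` and the dataset-name strings are hypothetical placeholders.
import torch

BATCH_SIZE = 128  # reported training batch size for all experiments

def build_optimizer(model: torch.nn.Module, dataset: str) -> torch.optim.Optimizer:
    """Return an optimizer matching the per-dataset settings quoted from the paper."""
    params = model.parameters()
    if dataset in {"VLCS", "OfficeHome"}:
        return torch.optim.AdamW(params, lr=1e-4)
    if dataset in {"RadImageNet", "MedMNIST"}:
        return torch.optim.SGD(params, lr=0.05)
    if dataset == "NYUv2":
        return torch.optim.Adam(params, lr=1e-4)
    raise ValueError(f"Unknown dataset: {dataset}")
```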