Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE
Authors: Zeren Chen, Ziqin Wang, Zhen Wang, Huayang Liu, Zhenfei Yin, Si Liu, Lu Sheng, Wanli Ouyang, Jing Shao
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results (about 20% improvement) have shown the effectiveness and versatility of our design in various 2D and 3D downstream tasks. |
| Researcher Affiliation | Collaboration | 1 Shanghai AI Laboratory, 2 School of Software, Beihang University, 3 Institute of Artificial Intelligence, Beihang University, 4 University of Sydney |
| Pseudocode | No | The paper describes methods in prose and diagrams but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and datasets are available at https://openlamm.github.io/paper_list/Octavius. |
| Open Datasets | Yes | Code and datasets are available at https://openlamm.github.io/paper_list/Octavius. |
| Dataset Splits | No | The paper uses several well-known datasets but does not explicitly specify the training/validation/test splits needed to reproduce results on all of them. For instance, it mentions 'fine-tune Octavius' and 'zero-shot evaluation' on various datasets without detailing which splits were used for training or validation. |
| Hardware Specification | Yes | All experiments are conducted on 4 NVIDIA A100 80GB GPUs. |
| Software Dependencies | No | The paper mentions software components like Vicuna-13B, SentencePiece, and the Adam optimizer, but does not provide version numbers for the underlying libraries (e.g., PyTorch or the SentencePiece package) needed for a reproducible setup; see the version-recording sketch after the table. |
| Experiment Setup | Yes | The number of experts in the above three setups is 4, 3, and 6, respectively. The rank of each LoRA expert is set to 32. During fine-tuning, we use an Adam (Kingma & Ba, 2014) optimizer with a total batch size of 64, a learning rate of 5 × 10⁻⁴, and 4 epochs on all setups (see the training-setup sketch after the table). |
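As a minimal sketch of the environment record the Software Dependencies row says is missing, the snippet below prints the installed versions of a few key libraries. The package names are assumptions based on the components the paper mentions (PyTorch for training, SentencePiece for tokenization); the paper itself does not list them.

```python
import importlib.metadata as metadata

# Hypothetical set of packages to record; the paper names the components
# (Vicuna-13B, SentencePiece, Adam) but not the library versions used.
for pkg in ("torch", "transformers", "sentencepiece"):
    try:
        print(f"{pkg}=={metadata.version(pkg)}")
    except metadata.PackageNotFoundError:
        print(f"{pkg}: not installed")
```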
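The Experiment Setup row quotes the key hyperparameters without code. Below is a minimal, dense-routing PyTorch sketch of a LoRA-MoE linear layer wired up with those numbers (4 experts, LoRA rank 32, Adam at a 5e-4 learning rate). The class names, the softmax router, and the choice of which layer hosts the experts are illustrative assumptions; the paper's own instance-level gating and expert placement may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRAExpert(nn.Module):
    """One low-rank adapter of rank r on top of a frozen linear weight."""
    def __init__(self, in_features: int, out_features: int, r: int = 32, alpha: float = 32.0):
        super().__init__()
        self.down = nn.Linear(in_features, r, bias=False)  # A: d_in -> r
        self.up = nn.Linear(r, out_features, bias=False)   # B: r -> d_out
        nn.init.normal_(self.down.weight, std=0.02)
        nn.init.zeros_(self.up.weight)                     # zero delta at init
        self.scale = alpha / r

    def forward(self, x):
        return self.up(self.down(x)) * self.scale

class LoRAMoELinear(nn.Module):
    """Frozen base linear plus a router-weighted mixture of LoRA experts."""
    def __init__(self, base: nn.Linear, num_experts: int = 4, r: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                        # keep the backbone frozen
        self.experts = nn.ModuleList(
            LoRAExpert(base.in_features, base.out_features, r=r)
            for _ in range(num_experts)
        )
        self.router = nn.Linear(base.in_features, num_experts, bias=False)

    def forward(self, x):
        gate = F.softmax(self.router(x), dim=-1)           # (..., E)
        delta = torch.stack([e(x) for e in self.experts], dim=-1)  # (..., d_out, E)
        return self.base(x) + (delta * gate.unsqueeze(-2)).sum(dim=-1)

# Hyperparameters quoted in the table: 4 experts, rank 32,
# Adam optimizer, learning rate 5e-4 (batch size 64, 4 epochs in training).
layer = LoRAMoELinear(nn.Linear(4096, 4096), num_experts=4, r=32)
optim = torch.optim.Adam(
    (p for p in layer.parameters() if p.requires_grad), lr=5e-4
)
```

Only the adapters and the router receive gradients here, which matches the general LoRA recipe of fine-tuning a small delta while the pretrained weights stay fixed.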