Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
LLMs Can Evolve Continually on Modality for $\mathbb{X}$-Modal Reasoning
Authors: Jiazuo Yu, Haomiao Xiong, Lu Zhang, Haiwen Diao, Yunzhi Zhuge, Lanqing Hong, Dong Wang, Huchuan Lu, You He, Long Chen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of the proposed AnA framework on learning plasticity and memory stability during continual learning. |
| Researcher Affiliation | Collaboration | ¹Dalian University of Technology, ²Huawei Noah's Ark Lab, ³Tsinghua University, ⁴The Hong Kong University of Science and Technology |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code locates at https://github.com/JiazuoYu/PathWeave. |
| Open Datasets | Yes | We establish a challenging benchmark, Continual Learning on Modality (MCL), which consists of multimodal high-quality QA data to evaluate the effectiveness of our method on continual uni-modal finetuning. These datasets are collected from five distinct modalities: image, video, depth, audio and point cloud. More details of the dataset list and size for each modality are illustrated in Table A6 of the Appendix. |
| Dataset Splits | Yes | Table A7 records the detailed hyper-parameters we used during the training and testing process... Modality Iteration Batch Size (Train/Val) Learning Rate |
| Hardware Specification | Yes | We optimize our model on 4 A800 GPUs (80GB) using AdamW [53] with β1 = 0.9, β2 = 0.999, and a weight decay of 0.05. |
| Software Dependencies | No | The paper states, 'Our method is built on the LAVIS library's framework [52] atop the Vicuna v1.1 7b [3].' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We optimize our model on 4 A800 GPUs (80GB) using AdamW [53] with β1 = 0.9, β2 = 0.999, and a weight decay of 0.05. ... Table A7 records the detailed hyper-parameters we used during the training and testing process. ... We keep all the learning rate decrease from 1e-5 and cosine annealing strategy with 0.5 decay weight. The warm-up phase starts from 1e-8 and lasts for 1000 iterations for all modality training. |
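
The learning-rate schedule quoted in the Experiment Setup row (warm-up from 1e-8 for 1000 iterations, then cosine annealing down from 1e-5) can be illustrated with a minimal sketch. This is not the authors' code. The function name `learning_rate`, the `total_steps` argument, and the floor `min_lr = 0.0` are hypothetical. The "0.5 decay weight" is assumed here to be the 0.5 factor of the standard half-cosine formula, and the warm-up is assumed to be linear; the quoted text does not confirm either. Per-modality iteration counts are in Table A7 of the paper and are not reproduced here.

```python
import math


def learning_rate(step, total_steps, peak_lr=1e-5, warmup_start_lr=1e-8,
                  warmup_steps=1000, min_lr=0.0):
    """Return the learning rate at a given training step.

    Assumptions (not confirmed by the source): linear warm-up,
    half-cosine decay, and a final learning rate of min_lr.
    """
    if step < warmup_steps:
        # Linear warm-up from warmup_start_lr to peak_lr.
        frac = step / warmup_steps
        return warmup_start_lr + (peak_lr - warmup_start_lr) * frac

    # Fraction of the post-warm-up phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)

    # Standard cosine annealing: 0.5 * (1 + cos(pi * progress)).
    return min_lr + (peak_lr - min_lr) * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 10_000  # hypothetical iteration count, for illustration only
    for s in (0, 500, 1000, 5500, 10_000):
        print(s, learning_rate(s, total))
```

In an actual training loop, the returned value would be written into each optimizer parameter group before the step. The AdamW settings quoted in the table (β1 = 0.9, β2 = 0.999, weight decay 0.05) are independent of this schedule.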