Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Modality-Aware SAM: Sharpness-Aware-Minimization Driven Gradient Modulation for Harmonized Multimodal Learning
Authors: Hossein Rajoli Nowdeh, Jie Ji, Xiaolong Ma, Fatemeh Afghah
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four diverse datasets show that M-SAM outperforms the latest state-of-the-art optimization and gradient manipulation methods and significantly balances and improves multimodal learning. |
| Researcher Affiliation | Academia | Hossein Rajoli, Holcombe Department of ECE, Clemson University; Jie Ji, Holcombe Department of ECE, Clemson University; Xiaolong Ma, Department of ECE, University of Arizona; Fatemeh Afghah, Holcombe Department of ECE, Clemson University |
| Pseudocode | Yes | Algorithm 1 M-SAM Algorithm |
| Open Source Code | No | All of our results can be reproduced and we will release our code after the paper is accepted. |
| Open Datasets | Yes | M-SAM is evaluated on three popular multi-modal datasets: AV-MNIST [33], CREMA-D [2], UR-Funny [11], and AVE [31]. Details of these datasets are presented in Appendix A. |
| Dataset Splits | No | The paper mentions training, validation, and test performance, and refers to following prior works for model design and preprocessing, but does not explicitly state the dataset splits used for any of the mentioned datasets. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running its experiments. It only states that the paper provides "sufficient information on the computer resources needed to reproduce the experiments by mentioning the hardware used for execution, such as the type of CPU or GPU" in the NeurIPS checklist, but this information is not present in the paper's main text or appendices. |
| Software Dependencies | No | The paper mentions using "SGD with 0.9 momentum and 10^-4 weight decay as the optimizer" and encoder architectures like "ResNet18" and "Transformer", but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | We used SGD with 0.9 momentum and 10^-4 weight decay as the optimizer. The learning rate was initially set to 10^-3 and was multiplied by 0.1 every 70 epochs. For UR-Funny, we utilize the preprocessed data introduced by [25]. |
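
The learning-rate schedule quoted in the Experiment Setup row can be illustrated with a short sketch. It assumes the initial rate is 10^-3 (the sign of the exponent is inferred, since it is not legible in the extracted text), that epochs are counted from 0, and that the decay applies at the start of epochs 70, 140, and so on. The function name `step_lr` and its parameter names are hypothetical and do not come from the paper.

```python
def step_lr(epoch: int, base_lr: float = 1e-3, gamma: float = 0.1,
            step_size: int = 70) -> float:
    """Return the learning rate for a 0-indexed epoch under step decay.

    Assumed values: base_lr=1e-3, multiplied by gamma=0.1 every 70 epochs.
    """
    # Integer division counts how many decay steps have occurred so far.
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    # Print the rate just before and after each assumed decay boundary.
    for epoch in (0, 69, 70, 139, 140):
        print(f"epoch {epoch:3d}: lr = {step_lr(epoch):.0e}")
```

In a framework such as PyTorch, the same setup would plausibly correspond to `torch.optim.SGD` with `momentum=0.9` and `weight_decay=1e-4`, paired with `torch.optim.lr_scheduler.StepLR(step_size=70, gamma=0.1)`. The summary does not name the framework used, so this mapping is an assumption.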