Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Asymmetric Reinforcing Against Multi-Modal Representation Bias

Authors: Xiyuan Gao, Bing Cao, Pengfei Zhu, Nannan Wang, Qinghua Hu

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments validated our superiority on various multimodal classification datasets against the SOTAs. Comparison with Imbalanced Multimodal Learning Methods: In this section, we compared ARM with advanced imbalanced multimodal learning methods to answer Q1: How does ARM narrow the modality contribution gap? Fig. 3 illustrates the trend of narrowing contribution gaps across different methods. Table 1 further reinforces this conclusion. ARM consistently outperforms other state-of-the-art methods, i.e., Greedy (Wu et al. 2022), OGM-GE (Peng et al. 2022), QMF (Zhang et al. 2023), PMR (Fan et al. 2023), Sample-valuation, Modality-valuation (Wei et al. 2024), and MLA (Zhang et al. 2024), achieving competitive accuracy scores of 66.52% and 75.60%, respectively. Table 4 further validates these observations with an ablation study.
Researcher Affiliation Academia 1) College of Intelligence and Computing, Tianjin University, Tianjin, 300000, China; 2) The State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, 710000, China. EMAIL, EMAIL
Pseudocode No The paper describes the proposed method using mathematical formulations and descriptive text (e.g., equations for MI, CMI, and loss functions), but it does not include a distinct block explicitly labeled as "Pseudocode" or "Algorithm" with structured steps.
Open Source Code No The paper states: "More details of implementation and experiment analysis are provided in the Appendix." However, it does not contain an explicit statement about releasing the source code or provide a link to a code repository.
Open Datasets Yes Kinetic Sounds (KS) (Arandjelovic and Zisserman 2017) is a specifically designed action recognition dataset for research in audio-visual learning... UCF-51 is a subset of UCF-101 (Soomro, Zamir, and Shah 2012)... UPMC Food-101 (Wang et al. 2015) is a comprehensive dataset for food recognition
Dataset Splits Yes UPMC Food-101 (Wang et al. 2015) is a comprehensive dataset for food recognition, consisting of 101,000 images accompanied by corresponding texts across 101 food categories. Each category includes 750 images for training and 250 images for testing.
Hardware Specification Yes The experiments are conducted on Huawei Atlas 800 Training Server with CANN and NVIDIA 4090 GPU.
Software Dependencies No Unless otherwise specified, ResNet-18 is used as the backbone in the experiments and trained from scratch. Encoders used for UCF-51 are ImageNet pre-trained. For Food-101, a ViT-based model is used as the vision encoder, and a BERT-based model is used as the text encoder, both pre-trained. During training, we use Stochastic Gradient Descent (SGD)... The experiments are conducted on Huawei Atlas 800 Training Server with CANN and NVIDIA 4090 GPU. The paper mentions various models (ResNet-18, ViT, BERT), an optimizer (SGD), and a compute framework (CANN), but does not provide specific version numbers for any of these software components.
Experiment Setup Yes Unless otherwise specified, ResNet-18 is used as the backbone in the experiments and trained from scratch. Encoders used for UCF-51 are ImageNet pre-trained. For Food-101, a ViT-based model is used as the vision encoder, and a BERT-based model is used as the text encoder, both pre-trained. Before modality valuation, a warm-up stage is employed for all experiments. During training, we use Stochastic Gradient Descent (SGD) with a batch size of 64. We set the initial learning rate, weight decay, and momentum parameters to 10^-3, 5×10^-4, and 0.9, respectively.
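The optimizer settings quoted above (SGD, learning rate 10^-3, weight decay 5×10^-4, momentum 0.9) can be made concrete with a minimal sketch of one SGD-with-momentum step on a scalar parameter, following the common PyTorch-style convention of coupled L2 weight decay. This is an illustrative sketch only; the function name and the scalar setup are ours, not the paper's.

```python
# Hyperparameters as reported in the paper's experiment setup.
LR, WEIGHT_DECAY, MOMENTUM = 1e-3, 5e-4, 0.9

def sgd_step(param, grad, velocity):
    """One SGD step with momentum and coupled L2 weight decay (scalar case)."""
    grad = grad + WEIGHT_DECAY * param      # add L2 penalty gradient
    velocity = MOMENTUM * velocity + grad   # update momentum buffer
    return param - LR * velocity, velocity  # descend along the buffer

# One illustrative update from param=1.0 with gradient 0.5.
p, v = sgd_step(1.0, grad=0.5, velocity=0.0)
# p ≈ 0.9994995, v = 0.5005
```

A batch size of 64 would simply determine how `grad` is averaged over samples before this step; the update rule itself is unchanged.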