Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Test-Time Selective Adaptation for Uni-Modal Distribution Shift in Multi-Modal Data

Authors: Mingcai Chen, Baoming Zhang, Zongbo Han, Wenyu Jiang, Yanmeng Wang, Shuai Feng, Yuntao Du, Bingkun Bao

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we validate the effectiveness of our proposed method through extensive experimental evaluations. Code available at https://github.com/chenmc1996/Uni-Modal-Distribution-Shift. [...] We perform experiments on multi-modal datasets with uni-modal distribution shift, and discovery limited performance gain. [...] Our method is validated through extensive experiments on the uni-modal distribution shifted datasets, and the results show that our approach achieves superior performance.
Researcher Affiliation	Academia	1Nanjing University of Posts and Telecommunications 2State Key Laboratory for Novel Software Technology at Nanjing University, Nanjing University 3College of Intelligence and Computing, Tianjin University 4Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR) & School of software, Shandong university. Correspondence to: Yuntao Du <EMAIL>, Bingkun Bao <EMAIL>.
Pseudocode	No	The paper describes the methodology using textual explanations and mathematical formulations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures with structured, code-like steps.
Open Source Code	Yes	Code available at https://github.com/chenmc1996/Uni-Modal-Distribution-Shift.
Open Datasets	Yes	To validate the effectiveness of our method, we perform comparison experiments and on two multi-modal datasets, Kinetics50 (Kay et al., 2017) and VGGSound (Chen et al., 2020), with diverse domain shifts.
Dataset Splits	No	The paper mentions using the "training sets of Kinetics50 and VGGSound dataset" for pre-training and then evaluating on corrupted versions of these datasets. While it implies standard usage, it does not explicitly provide specific split percentages (e.g., 80/10/10) or sample counts for training, validation, and testing splits needed for reproducibility. It only details the dataset construction and corruption procedures, and video trimming.
Hardware Specification	Yes	We implement the network on a GeForce RTX(TM) 3090 GPU and Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz.
Software Dependencies	No	For the software information and other experimental settings, please refer to our code https://github.com/chenmc1996/Uni-Modal-Distribution-Shift. The paper does not explicitly list specific software dependencies (e.g., Python, PyTorch, CUDA) with version numbers within the main text.
Experiment Setup	Yes	We mainly have the following hyper-parameters: The coefficient and threshold of self-training loss, the softmax temperature, the batch size. We use one set of hyper-parameters for the shift on one modality on each dataset (we keep the temperature to 0.001 and loss coefficient as 0.5 across all experiments). For Kinetics50-C with video shift, the threshold as 0.9, the batch size as 16. For Kinetics50-C with audio shift, the threshold as 0.9, the batch size as 64. For VGGSound-C with video shift, the threshold as 0.8, the batch size as 128. For VGGSound-C with audio shift, the threshold as 0.8, the batch size as 64. During test-time, our model is updated using the Adam optimizer.
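
The hyper-parameters quoted in the Experiment Setup row can be collected into a small lookup table, which may help when trying to reproduce a given setting. The following is only a sketch: the names `TTAConfig`, `CONFIGS`, and `get_config` are hypothetical and are not taken from the authors' repository, and values the quoted text does not state (for example, the learning rate) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTAConfig:
    """Hypothetical container for the test-time adaptation settings quoted above."""
    threshold: float                # threshold of the self-training loss
    batch_size: int
    temperature: float = 0.001      # softmax temperature, fixed across all experiments
    loss_coefficient: float = 0.5   # self-training loss coefficient, fixed across all experiments
    optimizer: str = "adam"         # the quoted text names Adam; no learning rate is given


# One configuration per (dataset, shifted modality) pair, as listed in the quoted text.
CONFIGS = {
    ("Kinetics50-C", "video"): TTAConfig(threshold=0.9, batch_size=16),
    ("Kinetics50-C", "audio"): TTAConfig(threshold=0.9, batch_size=64),
    ("VGGSound-C", "video"): TTAConfig(threshold=0.8, batch_size=128),
    ("VGGSound-C", "audio"): TTAConfig(threshold=0.8, batch_size=64),
}


def get_config(dataset: str, shifted_modality: str) -> TTAConfig:
    """Return the settings for a dataset and the modality under distribution shift."""
    return CONFIGS[(dataset, shifted_modality)]


if __name__ == "__main__":
    for (dataset, modality), cfg in CONFIGS.items():
        print(dataset, modality, cfg)
```

For example, `get_config("VGGSound-C", "video")` returns the entry with threshold 0.8 and batch size 128. The authors' code remains the authoritative source for the remaining settings.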