Test-time Adaptation against Multi-modal Reliability Bias

Authors: Mouxing Yang, Yunfan Li, Changqing Zhang, Peng Hu, Xi Peng

ICLR 2024

Each entry below gives a reproducibility variable, the assessed result, and the supporting LLM response.

Research Type: Experimental. In this section, we evaluate the proposed READ on the audio-visual joint action recognition and event classification tasks under multi-modal TTA with reliability bias. The section is organized as follows. In Section 4.1, we present the experimental settings, including benchmark construction and implementation details. In Section 4.2, we compare READ with state-of-the-art (SOTA) TTA methods under different settings and report several observations. In Section 4.3, we perform ablation studies and analytic experiments to give a comprehensive understanding of READ.

Researcher Affiliation: Academia. Mouxing Yang (Sichuan University), Yunfan Li (Sichuan University), Changqing Zhang (Tianjin University), Peng Hu (Sichuan University), Xi Peng (Sichuan University).

Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).

Open Source Code: Yes. The code and benchmarks are available at https://github.com/XLearning-SCU/2024-ICLR-READ.

Open Datasets: Yes. To facilitate the investigation of multi-modal TTA with reliability bias, we construct two benchmarks based on the widely used multi-modal datasets Kinetics (Kay et al., 2017) and VGGSound (Chen et al., 2020).

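The entry above only names the source datasets; the benchmarks introduce reliability bias by corrupting one modality at test time while the other stays clean. Below is a minimal sketch of that idea, assuming a raw waveform tensor and Gaussian noise as the corruption; the actual corruption types and severity levels are defined in the released repository, and corrupt_audio is a hypothetical helper.

```python
import torch

def corrupt_audio(waveform: torch.Tensor, severity: int = 3) -> torch.Tensor:
    # Hypothetical corruption: add zero-mean Gaussian noise whose scale grows
    # with severity. Degrading only the audio modality while the video stays
    # clean is what makes one modality less reliable than the other.
    sigma = 0.02 * severity
    return waveform + sigma * torch.randn_like(waveform)
```
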
Dataset Splits: No. The paper states the numbers of training pairs and test pairs for Kinetics50 (29,204 and 2,466, respectively) and the number of testing pairs for VGGSound (14,046), but does not describe a separate validation split in terms of size, percentage, or use for hyperparameter tuning.

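For quick reference, the split sizes quoted above can be collected in a small config. The dictionary below is a hypothetical bookkeeping summary, not something the paper or repository provides; None marks numbers the paper does not report, including the absent validation split.

```python
# Hypothetical summary of the split sizes reported in the paper.
BENCHMARK_SPLITS = {
    "Kinetics50": {"train_pairs": 29_204, "val_pairs": None, "test_pairs": 2_466},
    "VGGSound":   {"train_pairs": None,   "val_pairs": None, "test_pairs": 14_046},
}
```
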
Hardware Specification: Yes. All evaluations are run on an Ubuntu 20.04 platform with NVIDIA RTX 3090 GPUs.

Software Dependencies: No. The paper mentions an Ubuntu 20.04 platform but does not give version numbers for key software components such as the deep learning framework or libraries used in the experiments.

Experiment Setup: Yes. During the test-time adaptation phase, READ performs online updates on specific parameters of the source models using the Adam optimizer, with an initial learning rate of 0.0001, a mini-batch size of 64, and a single epoch. The confidence threshold γ in Eq. 6 is fixed to e^-1 for all settings.
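
The setup above (Adam, learning rate 1e-4, batch size 64, one online epoch, confidence threshold e^-1) can be pictured as the following loop. This is a minimal sketch, not the paper's implementation: the choice of which parameters to update, the two-stream model interface, and the entropy-based confidence filter with an entropy-minimization objective are all assumptions here; the actual loss and adaptable parameters are defined in the released code.

```python
import math
import torch

def entropy(probs: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Shannon entropy of each prediction in the batch.
    return -(probs * (probs + eps).log()).sum(dim=1)

def adapt_one_epoch(model, test_loader, gamma=math.exp(-1), lr=1e-4):
    # Hypothetical choice: update only normalization-layer parameters, a
    # common design in TTA methods; READ's actual parameter set may differ.
    params = [p for name, p in model.named_parameters() if "norm" in name.lower()]
    optimizer = torch.optim.Adam(params, lr=lr)

    model.train()
    for audio, video in test_loader:      # online mini-batches of size 64
        logits = model(audio, video)      # assumed two-stream interface
        probs = logits.softmax(dim=1)
        conf = (-entropy(probs)).exp()    # confidence = exp(-entropy)
        mask = conf > gamma               # keep samples above the e^-1 threshold
        if mask.any():
            # Entropy minimization on confident samples (an assumed objective).
            loss = entropy(probs[mask]).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

Since conf = exp(-H), the gamma = exp(-1) threshold keeps exactly the test samples whose prediction entropy H is below 1 nat.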