Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval

Authors: Xiaoshuai Hao, Wanqian Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments in the context of domain-adaptive video-text retrieval demonstrate that our proposed method consistently outperforms multiple baselines, showing a superior generalization ability for target data.
Researcher Affiliation | Collaboration | Xiaoshuai Hao (Samsung Research China Beijing, SRC-B); Wanqian Zhang (Institute of Information Engineering, Chinese Academy of Sciences)
Pseudocode | No | The paper contains no clearly labeled pseudocode or algorithm blocks; it provides mathematical formulations but no structured algorithmic steps.
Open Source Code | No | The paper includes no explicit statement about releasing source code and no link to a code repository.
Open Datasets | Yes | Datasets. We use existing datasets across three domains to explore the UDAVR task, i.e., a comprehensive evaluation benchmark built by combining three popular datasets: MSR-VTT (Mt) [59], MSVD (Md) [19], and TGIF (Tf) [33].
Dataset Splits | Yes | We set the max epochs as 100, and early stopping occurs if the validation performance does not improve in ten consecutive epochs. (The use of a validation set for early stopping implies a train/validation partition.)
Hardware Specification | Yes | All experiments are conducted five times for the average performance on a 2080Ti GPU server.
Software Dependencies | No | The paper mentions using the Adam optimizer and employing the identical architecture for video and text encoders as used in GPO [5], but it does not specify version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | During the training procedure, we set the batch size to 64 and utilize a step-decayed learning rate with initialization value 0.0001. All experiments are conducted five times for the average performance on a 2080Ti GPU server. For all our experiments, we set a to -0.005 and b to 6. The hyperparameters λ1, λ2, K and T of the overall objective function are discussed extensively in Section 4.3. We set the max epochs as 100, and early stopping occurs if the validation performance does not improve in ten consecutive epochs.