Uncertainty-Aware Alignment Network for Cross-Domain Video-Text Retrieval
Authors: Xiaoshuai Hao, Wanqian Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments in the context of domain-adaptive video-text retrieval demonstrate that our proposed method consistently outperforms multiple baselines, showing a superior generalization ability for target data. |
| Researcher Affiliation | Collaboration | Xiaoshuai Hao (Samsung Research China Beijing, SRC-B); Wanqian Zhang (Institute of Information Engineering, Chinese Academy of Sciences) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It provides mathematical formulations but no structured algorithmic steps. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | Datasets. We use existing datasets across three domains to explore the UDAVR task, i.e., a comprehensive evaluation benchmark by combining three popular datasets: MSR-VTT (Mt) [59], MSVD (Md) [19], and TGIF (Tf) [33]. |
| Dataset Splits | Yes | We set the max epochs as 100, and early stop occurs if the validation performance does not improve in ten consecutive epochs. |
| Hardware Specification | Yes | All experiments are conducted five times for the average performance on a 2080Ti GPU server. |
| Software Dependencies | No | The paper mentions using Adam optimizer and employing the identical architecture for video and text encoders as used in GPO [5], but it does not specify version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | During the training procedure, we set the batch size to 64, and utilize a step-decayed learning rate with initialization value 0.0001. All experiments are conducted five times for the average performance on a 2080Ti GPU server. For all our experiments, we set a to -0.005 and b to 6. The hyperparameters λ1, λ2, K and T of the overall objective function are discussed extensively in Section 4.3. We set the max epochs as 100, and early stop occurs if the validation performance does not improve in ten consecutive epochs. |
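The training schedule quoted above (step-decayed learning rate initialized at 0.0001, max 100 epochs, early stopping after ten non-improving validation epochs) can be sketched as follows. Note this is a minimal illustration, not the authors' code: the decay factor and step size are assumptions, since the paper does not report them.

```python
def step_decay_lr(epoch, base_lr=1e-4, decay=0.1, step=30):
    """Step-decayed learning rate, initialized at 0.0001 as in the paper.
    The decay factor (0.1) and step interval (30 epochs) are assumed values."""
    return base_lr * (decay ** (epoch // step))


class EarlyStopping:
    """Stop training if the validation metric does not improve for `patience`
    consecutive epochs (the paper uses a patience of ten)."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def update(self, val_metric):
        # Returns True when training should stop.
        if val_metric > self.best:
            self.best = val_metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

A training loop would call `step_decay_lr(epoch)` each epoch (up to the paper's maximum of 100) and break out once `EarlyStopping.update(val_metric)` returns True.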