Mind-the-Gap! Unsupervised Domain Adaptation for Text-Video Retrieval
Authors: Qingchao Chen, Yang Liu, Samuel Albanie (pp. 1072-1080)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we introduce the UDAVR text-video retrieval benchmark and evaluate the proposed framework. We first conduct a detailed domain shift analysis on UDAVR. Then, using this analysis, we select four adaptation directions from among the set of possible adaptations between the four domains. Next we compare our proposed CAPQ with existing retrieval and UDA methods. Finally, we present an ablation study to analyze the model configurations of the proposed methods. |
| Researcher Affiliation | Academia | 1 National Institute of Health Data Science, Peking University, Beijing, China 2 Wangxuan Institute of Computer Technology, Peking University, Beijing, China 3 Visual Geometry Group, University of Oxford, Oxford, UK 4 Department of Engineering Science, University of Oxford, Oxford, UK |
| Pseudocode | No | The paper describes the CAPQ framework with equations and descriptions of its components and their interactions, but it does not contain a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a direct link to a code repository for the methodology described. |
| Open Datasets | Yes | The UDAVR benchmark consists of 160k videos, sourced from four datasets spanning different domains: (1) Audio Visual Events from the MSR-VTT dataset (Xu et al. 2016), (2) Visual Events from the MSVD dataset (Chen and Dolan 2011), (3) Activities from the ActivityNet Captions dataset (Krishna et al. 2017), (4) Movie Clips from LSMDC (Rohrbach et al. 2015). |
| Dataset Splits | Yes | We evaluate the proposed method using four splits (adaptation directions) of UDAVR defined in the previous section. We adopt the standard retrieval metrics of the target domain dataset, namely R@K (K = 1,10) and median rank (MR). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The paper describes the overall framework and evaluation setup, but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed training configurations in the main text. |
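
The retrieval metrics named in the Dataset Splits row, R@K and median rank (MR), can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes a square text-to-video similarity matrix in which the correct video for query `i` sits at column `i`; the function name `retrieval_metrics` and that layout are hypothetical choices made for illustration.

```python
import numpy as np

def retrieval_metrics(sim, ks=(1, 10)):
    """Compute R@K (in percent) and median rank for text-to-video retrieval.

    Assumption: sim[i, j] is the similarity of text query i to video j,
    and the ground-truth video for query i is video i.
    """
    sim = np.asarray(sim, dtype=float)
    # Score of the correct video for each query, as a column vector.
    gt = np.diag(sim)[:, None]
    # Rank of the correct video: 1 + number of videos scoring strictly higher.
    ranks = 1 + (sim > gt).sum(axis=1)
    metrics = {f"R@{k}": float(100.0 * np.mean(ranks <= k)) for k in ks}
    metrics["MR"] = float(np.median(ranks))
    return metrics
```

Higher R@K is better, since it is the share of queries whose correct video appears in the top K results. Lower MR is better, since it is the median position of the correct video across all queries.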