CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer
Authors: Yabing Wang, Fan Wang, Jianfeng Dong, Hao Luo
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed approach on two multilingual image-text datasets, Multi30K and MSCOCO, and one video-text dataset, VATEX. The results clearly demonstrate the effectiveness of our proposed method and its high potential for large-scale retrieval. |
| Researcher Affiliation | Collaboration | 1 Zhejiang Gongshang University; 2 Xi'an Jiaotong University; 3 DAMO Academy, Alibaba Group; 4 Hupan Lab, Zhejiang Province; 5 Zhejiang Key Lab of E-Commerce |
| Pseudocode | No | The paper describes the methodology using mathematical equations and descriptive text, but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating the release of source code for the described methodology. |
| Open Datasets | Yes | We conduct experiments on two public multilingual image-text retrieval datasets (Multi30K and MSCOCO), as well as a video-text retrieval dataset (VATEX). (...) Multi30K (Elliott et al. 2016): (...) MSCOCO (Chen et al. 2015): (...) VATEX (Wang et al. 2019): |
| Dataset Splits | Yes | We adopt a similar data partition as (Young et al. 2014). (...) We follow the data split as in (Zhou et al. 2021). (...) We adopt a similar data partition as (Chen et al. 2020). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions models such as CLIP and mBERT-base and an Adam optimizer, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Besides, we set λ = 0.6, α = 0.4, and τ = 0.07 in our experiments. The batch size is 128, and an Adam optimizer with an initial learning rate 2.5e-5 and adjustment schedule similar to (Luo et al. 2022) is utilized. |
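
The Experiment Setup row quotes concrete hyperparameters (λ = 0.6, α = 0.4, τ = 0.07, batch size 128, Adam with initial learning rate 2.5e-5). The sketch below shows one plausible way these values could be wired into a PyTorch training step. Only the quoted values come from the paper; the placeholder encoder, the symmetric InfoNCE-style contrastive loss, and the cosine schedule (the paper only says the schedule is "similar to (Luo et al. 2022)") are assumptions for illustration.

```python
# Hedged sketch of the reported optimization setup. Only the hyperparameter
# values (batch size 128, Adam, lr 2.5e-5, lambda=0.6, alpha=0.4, tau=0.07)
# are taken from the paper; the model, loss formulation, and scheduler are
# placeholders, not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

BATCH_SIZE = 128
LR = 2.5e-5
LAMBDA = 0.6   # reported weighting coefficient; its exact role follows the paper's objective
ALPHA = 0.4    # reported weighting coefficient; its exact role follows the paper's objective
TAU = 0.07     # temperature reported for the contrastive objective


def info_nce(img_emb: torch.Tensor, txt_emb: torch.Tensor, tau: float = TAU) -> torch.Tensor:
    """Symmetric contrastive loss over a batch of paired embeddings (assumed formulation)."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / tau                     # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


# Placeholder projection standing in for the paper's vision/language backbones.
model = nn.Linear(512, 256)
optimizer = torch.optim.Adam(model.parameters(), lr=LR)
# Assumed schedule; the paper only states it is "similar to (Luo et al. 2022)".
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=30)

# One illustrative step on random tensors in place of real image/text features.
image_feat = torch.randn(BATCH_SIZE, 512)
text_feat = torch.randn(BATCH_SIZE, 512)
loss = info_nce(model(image_feat), model(text_feat))
loss.backward()
optimizer.step()
scheduler.step()
```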