Dynamic Weighted Combiner for Mixed-Modal Image Retrieval
Authors: Fuxiang Huang, Lei Zhang, Xiaowei Fu, Suqi Song
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments verify that our proposed model significantly outperforms state-of-the-art methods on real-world datasets. The source code is available at https://github.com/fuxianghuang1/DWC. Experiments on Fashion200K, Shoes, and Fashion IQ datasets show the outstanding performances. |
| Researcher Affiliation | Academia | Fuxiang Huang, Lei Zhang*, Xiaowei Fu, Suqi Song; Learning Intelligence & Vision Essential (LiVE) Group, School of Microelectronics and Communication Engineering, Chongqing University, China; {huangfuxiang, leizhang, xwfu, songsuqi}@cqu.edu.cn |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The source code is available at https://github.com/fuxianghuang1/DWC. |
| Open Datasets | Yes | To evaluate our model, we chose three real-world datasets: Fashion200K (Han et al. 2017), Shoes (Guo et al. 2018), and Fashion IQ (Wu et al. 2021). |
| Dataset Splits | No | The paper mentions training, validation, and test phases but does not provide specific details on the dataset splits (e.g., percentages or sample counts) for any of these sets. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using CNN, LSTM, and CLIP models as components but does not specify any software libraries, frameworks (e.g., PyTorch, TensorFlow), or their version numbers. |
| Experiment Setup | No | The paper describes the model architecture and training process, including losses and mutual enhancement. However, it does not provide specific numerical values for hyperparameters such as learning rate, batch size, number of epochs, or the optimizer used. |