FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval
Authors: Yanzhe Chen, Huasong Zhong, Xiangteng He, Yuxin Peng, Jiahuan Zhou, Lele Cheng
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments demonstrate our approach's state-of-the-art performance on four commonly used datasets. |
| Researcher Affiliation | Collaboration | 1Wangxuan Institute of Computer Technology, Peking University 2Kuaishou Technology |
| Pseudocode | No | Information insufficient. The paper describes its models and processes using text and mathematical equations, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | Information insufficient. The paper does not provide an explicit statement or link for the open-source code of its proposed method. |
| Open Datasets | Yes | We conduct extensive experiments on four commonly used datasets, namely Fashion IQ (Yu et al. 2020), Fashion200K (Han et al. 2017), CIRR (Liu et al. 2021) and Shoes (Berg, Berg, and Shih 2010). |
| Dataset Splits | Yes | CIRR (Liu et al. 2021): The dataset contains 21,552 real-world images from NLVR2 (Suhr et al. 2018). There are 36,554 triplets in total, divided into 3 subsets with 80% in training, 10% in validation, and 10% in testing. |
| Hardware Specification | Yes | We use 8 Tesla V100 GPUs for model training. |
| Software Dependencies | No | Information insufficient. The paper mentions the use of an optimizer (Adam) and model backbones (ResNet50x4, ViT-B/16) but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The initial learning rate is 4e-5, and we adopt a cosine annealing strategy to adjust it. The total number of training epochs is 50. We use Adam (Kingma and Ba 2014) to optimize the network with a mini-batch size of 1024. |
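The quoted learning-rate schedule (initial rate 4e-5, cosine annealing over 50 epochs) can be sketched with the standard cosine-annealing formula; the paper does not give its exact variant or minimum rate, so the function name, `min_lr=0.0` default, and per-epoch stepping are illustrative assumptions:

```python
import math

def cosine_annealed_lr(epoch, total_epochs=50, base_lr=4e-5, min_lr=0.0):
    """Standard cosine annealing: decay base_lr toward min_lr over total_epochs.

    At epoch 0 this returns base_lr; at total_epochs it returns min_lr.
    (Defaults mirror the paper's reported setup; min_lr is an assumption.)
    """
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, `cosine_annealed_lr(0)` yields the initial rate 4e-5, and the rate has fallen to half that value at the schedule's midpoint (epoch 25).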