Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization

Authors: Yiyang Chen, Zhedong Zheng, Wei Ji, Leigang Qu, Tat-Seng Chua

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On the three public datasets, i.e., Fashion IQ, Fashion200k, and Shoes, the proposed method achieves +4.03%, +3.38%, and +2.40% Recall@50 accuracy over a strong baseline, respectively.
Researcher Affiliation | Academia | Yiyang Chen (1,2), Zhedong Zheng (3), Wei Ji (1), Leigang Qu (1), Tat-Seng Chua (1); (1) Sea-NExT Joint Lab, National University of Singapore; (2) Tsinghua University; (3) Faculty of Science and Technology, and Institute of Collaborative Innovation, University of Macau.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is based on PyTorch (Paszke et al., 2019), and anonymous code is available at https://github.com/Monoxide-Chen/uncertainty_retrieval. The authors state they will make the code open-source for reproducing all results.
Open Datasets | Yes | We verify the effectiveness of the proposed method on the fashion datasets, which collect feedback from customers easily, including Fashion IQ (Wu et al., 2021), Fashion200k (Han et al., 2017), and Shoes (Guo et al., 2018).
Dataset Splits | No | The paper specifies training and test/evaluation splits for each dataset (e.g., 'training and test split', '10,000 samples for training and 4,658 samples for evaluation', '172,000 images for training... and 33,480 test queries for evaluation'). However, it does not explicitly provide details for a separate validation split.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory, or processor types) used to run its experiments.
Software Dependencies | No | The paper states 'The code is based on Pytorch (Paszke et al., 2019)', but does not provide specific version numbers for PyTorch or other ancillary software dependencies such as Python or CUDA.
Experiment Setup | Yes | SGD optimizer (Robbins & Monro, 1951) is deployed with a mini-batch of 32 for 50 training epochs, and the base learning rate is 2×10⁻², following Lee et al. (2021). We apply the one-step learning rate scheduler to decay the learning rate by a factor of 10 at the 45th epoch. We empirically set w₁ = 1 and w₂ = 1, which control the scale of the Gaussian noise generated by the augmenter, and the initial balance weight γ₀ = 1.
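
The reported optimizer and scheduler settings map onto a standard PyTorch training loop. The sketch below is a minimal reconstruction of that configuration only; the model, loss, and data are random stand-ins (the paper's uncertainty-regularized loss and noise augmenter are not reproduced), and w₁, w₂, and γ₀ are recorded as constants for reference.

```python
# Minimal sketch of the reported training configuration: SGD, mini-batch 32,
# 50 epochs, base lr 2e-2, one-step decay by a factor of 10 at epoch 45.
# The network, loss, and data below are placeholders, not the authors' code.
import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(512, 512)                              # placeholder retrieval network
optimizer = SGD(model.parameters(), lr=2e-2)             # base learning rate 2x10^-2
scheduler = StepLR(optimizer, step_size=45, gamma=0.1)   # 10x decay at the 45th epoch

w1, w2, gamma0 = 1.0, 1.0, 1.0  # reported noise scales and initial balance weight (not used below)

criterion = nn.MSELoss()        # stand-in for the paper's uncertainty-regularized matching loss
loader = [(torch.randn(32, 512), torch.randn(32, 512))]  # stand-in mini-batches of size 32

for epoch in range(50):
    for query_feat, target_feat in loader:
        optimizer.zero_grad()
        loss = criterion(model(query_feat), target_feat)
        loss.backward()
        optimizer.step()
    scheduler.step()            # stepped once per epoch, so the decay fires once, at epoch 45
```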