Towards Hands-Free Visual Dialog Interactive Recommendation

Authors: Tong Yu, Yilin Shen, Hongxia Jin

AAAI 2020, pp. 1137-1144

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The empirical results show that the probability of finding the desired items by our system is about 3 times as high as that by the traditional interactive recommenders, after a few user interactions." From the section "Experiments: Dataset and Online Evaluation": "We evaluate different approaches on the footwear dataset (Berg, Berg, and Shih 2010; Guo et al. 2018)."
Researcher Affiliation | Industry | Tong Yu, Yilin Shen, Hongxia Jin; Samsung Research America, Mountain View, CA, USA; {tong.yu, yilin.shen, hongxia.jin}@samsung.com
Pseudocode | Yes | "Algorithm 1 presents our algorithm in a more general case."

Algorithm 1 SPR bandit
Input: λ, L, K, K′, d
1: τ = 1, τ′ = 1, θ̂_0 = 0 ∈ ℝ^{d×1}, S_0 = λ^{-1} I_d ∈ ℝ^{d×d}, x_center = 0 ∈ ℝ^{d×1}, B = [L]
2: for all t = 1, …, n do
3:     sample the model parameters θ̃_t ~ N(θ̂_{t-1}, S_{t-1})
4:     for all k = 1, …, K do
5:         a_t^k ← arg max_{e ∈ B \ {a_t^1, …, a_t^{k-1}}} x_e^⊤ θ̃_t
7:     recommend items A_t ← (a_t^1, …, a_t^K)
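As a concrete illustration, below is a minimal Python/NumPy sketch of one recommendation step of Algorithm 1 (lines 3-7). The function name, the feature matrix X, and the reduction of the sequential arg max to a single top-K selection are assumptions for illustration; the paper's posterior update and the maintenance of the candidate set B are omitted.

    import numpy as np

    def spr_bandit_step(theta_hat, S, X, K, rng):
        """One step of the Thompson-sampling recommendation loop (a sketch).

        theta_hat : (d,) posterior mean of the model parameters
        S         : (d, d) posterior covariance
        X         : (L, d) image feature vectors, one row per candidate item
        K         : size of the recommended list
        """
        # Line 3: sample model parameters from the Gaussian posterior.
        theta_tilde = rng.multivariate_normal(theta_hat, S)
        # Lines 4-5: scores are linear in the features; with one sampled
        # theta, the sequential arg max over the remaining candidates
        # reduces to taking the K highest-scoring items.
        scores = X @ theta_tilde
        recommended = np.argsort(-scores)[:K]
        # Line 7: return the recommended list A_t.
        return recommended.tolist()

    # Example: L = 500 candidate items with d = 16 features, list size K = 10.
    d, L, K, lam = 16, 500, 10, 1.0
    rng = np.random.default_rng(0)
    X = rng.standard_normal((L, d))
    theta_hat = np.zeros(d)   # θ̂_0 = 0 (Algorithm 1, line 1)
    S = np.eye(d) / lam       # S_0 = λ^{-1} I_d (Algorithm 1, line 1)
    print(spr_bandit_step(theta_hat, S, X, K, rng))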
Open Source Code | No | "The authors of (Guo et al. 2018) release the captioner code on GitHub: https://github.com/XiaoxiaoGuo/fashion-retrieval." This link points to a third-party tool (the captioner) used in the paper's evaluation, not to the core methodology developed in this paper.
Open Datasets | Yes | "We evaluate different approaches on the footwear dataset (Berg, Berg, and Shih 2010; Guo et al. 2018)."
Dataset Splits | No | "Similar to (Guo et al. 2018), we train the item identifier and visual dialog encoder on 10,000 images, and evaluate our recommender in the online setting on another dataset with 4,658 images." While this describes training on one set and evaluating on another, the paper does not provide explicit train/validation/test splits for reproducibility.
Hardware Specification | No | The paper does not provide hardware details such as GPU/CPU models, memory, or cloud-computing specifications used to run the experiments.
Software Dependencies | No | The paper mentions neural-network architectures such as ResNet101, CNN, and GRU, but does not give version numbers for any software libraries or frameworks (e.g., TensorFlow, PyTorch, Python).
Experiment Setup | Yes | "The inputs are the hyper-parameter λ of the Gaussian distribution, the total number of items L, the size of list K, a hyper-parameter K′ and the dimensionality of the image feature vector d." The list size is K = 10 and results are shown up to n = 100 steps. "Similar to (Guo et al. 2018), we train the item identifier and visual dialog encoder on 10,000 images, and evaluate our recommender in the online setting on another dataset with 4,658 images."
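For reference, the reported setup can be collected into a single configuration sketch. The dictionary and its key names are hypothetical; values not stated in the quoted text (λ, d, K′) are left unset rather than guessed.

    # Hypothetical configuration assembled from the quoted setup; names are
    # illustrative, and values not stated in the paper are left as None.
    experiment_config = {
        "lambda_": None,           # Gaussian-distribution hyper-parameter (value not reported)
        "K": 10,                   # size of the recommended list
        "n_steps": 100,            # interaction steps shown in the results
        "n_train_images": 10_000,  # train item identifier and visual dialog encoder
        "n_eval_images": 4_658,    # separate dataset for online evaluation
        "d": None,                 # image feature dimensionality (not stated)
        "K_prime": None,           # auxiliary hyper-parameter K′ (value not reported)
    }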